Abstract. We introduce a new probabilistic technique for finding 'almost-periods' 
of convolutions of subsets of groups. This gives results similar to the Bogolyubov-type 
estimates established by Fourier analysis on abelian groups but without the need for 
a nice Fourier transform to exist. We also present applications, some of which are 
new even in the abelian setting. These include a probabilistic proof of Roth's theorem 
on three-term arithmetic progressions and a proof of a variant of the Bourgain-Green 
theorem on the existence of long arithmetic progressions in sumsets $A + B$ that works 
with sparser subsets of $\{1, \ldots, N\}$ than previously possible. In the non-abelian setting 
we exhibit analogues of the Bogolyubov-Freiman-Halberstam-Ruzsa-type results of 
additive combinatorics, showing that product sets $A_1 \cdot A_2 \cdot A_3$ and $A^2 \cdot A^{-2}$ are rather 
structured, in the sense that they contain very large iterated product sets. This is 
particularly so when the sets in question satisfy small-doubling conditions or high 
multiplicative energy conditions. We also present results on structures in $A \cdot B$. 

Our results are 'local' in nature, meaning that it is not necessary for the sets under 
consideration to be dense in the ambient group. In particular, our results apply to 
finite subsets of infinite groups provided they 'interact nicely' with some other set. 
1. Introduction and statements of results 

There are many interesting problems that are concerned with counting various structures 
in subsets of groups. Many of these can be expressed in terms of the operation of 
convolution, defined for two functions $f, g : G \to \mathbb{C}$ on a group $G$ to be the function 
$f * g$ given by 

$$f * g(x) := \sum_{y \in G} f(y) g(y^{-1}x),$$

provided this exists for all $x \in G$. For example, many of the central objects of additive 
combinatorics can be expressed directly in terms of convolutions: the product set 
$A \cdot B = \{ab : a \in A, b \in B\}$ of two subsets of a group is precisely the support of the function 
$1_A * 1_B$, where $1_X$ denotes the indicator function of a set $X$, and the number of three-term 
arithmetic progressions in an additive set $A$, i.e., tuples $(a_1, a_2, a_3) \in A \times A \times A$ 
with $a_1 + a_3 = 2a_2$, is $1_A * 1_{-2 \cdot A} * 1_A(0)$. One may think of a convolution as being a sum 
of a function weighted by translates of another function and, as such, one may hope that 
convolutions are somewhat 'smooth'. Indeed there are various senses in which this is 
true, and having precise notions of what it means can lead to interesting combinatorial 
consequences. Such results are often proved for abelian groups using the beautiful theory 
of Fourier analysis, where one uses the fact that convolutions and Fourier transforms 
interact in a very nice way. In this paper our aim is to demonstrate a new technique 
for establishing results about convolutions that are similar to those of Fourier analysis 
but that work on arbitrary groups, as well as to present applications. 
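
As a concrete illustration of these definitions, the following Python sketch (not part of the paper) computes convolutions of indicator functions on the cyclic group $\mathbb{Z}_N$. The modulus `N`, the sets `A` and `B`, and the helper names `indicator` and `convolve` are all choices made for this example. It checks that the support of $1_A * 1_B$ is the sumset $A + B$ and that $1_A * 1_{-2 \cdot A} * 1_A(0)$ counts three-term progressions (trivial ones included); `N` is taken odd so that $a \mapsto -2a$ is injective.

```python
from collections import Counter
from itertools import product

def indicator(X):
    """Indicator function 1_X stored sparsely as {element: 1}."""
    return Counter({x: 1 for x in X})

def convolve(f, g, mul):
    """f*g(x) = sum_y f(y) g(y^{-1}x), written as a sum over pairs (y, z) with yz = x."""
    h = Counter()
    for (y, fy), (z, gz) in product(f.items(), g.items()):
        h[mul(y, z)] += fy * gz
    return h

N = 15  # odd, so a -> -2a is a bijection of Z_N (an assumption of this example)
add = lambda a, b: (a + b) % N
A = {1, 2, 4, 7, 8}
B = {0, 3, 5}

# Support of 1_A * 1_B versus the sumset A + B.
conv_AB = convolve(indicator(A), indicator(B), add)
support = {x for x, v in conv_AB.items() if v > 0}
sumset = {add(a, b) for a in A for b in B}

# Three-term progressions: tuples with a1 + a3 = 2*a2, counted two ways.
minus_2A = {(-2 * a) % N for a in A}
triple = convolve(convolve(indicator(A), indicator(minus_2A), add), indicator(A), add)
ap_count = triple[0]
brute_force = sum(1 for a1, a2, a3 in product(A, repeat=3)
                  if (a1 + a3 - 2 * a2) % N == 0)
```

Passing the group operation as `mul` means the same `convolve` works unchanged for a non-abelian group, provided the elements are hashable.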

1.1. Notation. Before we state our results let us introduce some notation, most of 
which is standard, directing the reader to the book [44] of Tao and Vu or the paper 
[43] of Tao for more details and interesting information about the concepts we use. 
Throughout the paper $G$ will denote a group (which may potentially be infinite). For 
two subsets $A$ and $B$ of $G$ we write $A \cdot B := \{ab : a \in A, b \in B\}$ for the product set of 
$A$ and $B$, and $A^{-1}$ for the collection of inverses of elements of $A$. Sometimes we shall 
omit the $\cdot$ and just juxtapose two sets to indicate the multiplication. For an element 
$t$ of $G$ we write $tA := \{ta : a \in A\}$ for the left-translate of $A$ by $t$ and similarly for 
the right-translate $At$. If $k$ is a positive integer then we write $A^k := A \cdot A \cdots A$ for the 
$k$-fold product set of $A$, and $A^{-k}$ for the $k$-fold product set of $A^{-1}$. For abelian groups 
we write the group operation additively and we give the corresponding definitions to 
$A + B$, $A - B$, $t + A$, $kA$, etc. The multiplicative energy between two sets $A$ and $B$ is 
defined to be the quantity 

$$E(A, B) := \sum_{x \in G} 1_A * 1_B(x)^2;$$

for abelian groups this is known as the additive energy. For a function $f : G \to \mathbb{C}$ and 
a real number $p \ge 1$ we write $\|f\|_p^p = \|f(x)\|_p^p := \sum_{x \in G} |f(x)|^p$ for the ($p$th power of) 
the $L^p$ norm of $f$ provided this is finite. Thus $E(A, B) = \|1_A * 1_B\|_2^2$. A final piece of 
terminology: for finite groups $G$ we say that the density of a set $A \subseteq G$ is $|A|/|G|$. 
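
The energy is easy to compute for small sets, and a minimal sketch (an illustration only, with the example sets and the function names `energy` and `energy_by_quadruples` chosen here) shows how it separates structured sets from unstructured ones. It also checks the equivalent description of $E(A, B)$ as the number of quadruples $(a, b, a', b')$ with $ab = a'b'$.

```python
from collections import Counter
from itertools import product

def energy(A, B, mul):
    """E(A, B) = sum_x (1_A * 1_B(x))^2, via representation counts of products ab."""
    reps = Counter(mul(a, b) for a in A for b in B)
    return sum(r * r for r in reps.values())

def energy_by_quadruples(A, B, mul):
    """The same quantity, counted directly as quadruples with ab = a'b'."""
    return sum(1 for a, b, a2, b2 in product(A, B, A, B)
               if mul(a, b) == mul(a2, b2))

add = lambda a, b: a + b                      # the additive group Z
progression = set(range(10))                  # highly structured
powers_of_two = {2 ** i for i in range(10)}   # essentially no additive structure

e_ap = energy(progression, progression, add)       # close to |A|^3
e_geo = energy(powers_of_two, powers_of_two, add)  # close to |A|^2
```

For a set of size $n$ the energy always lies between $n^2$ and $n^3$; the progression sits near the top of this range and the powers of two near the bottom.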

1.2. The almost-periodicity results. Our first result, then, is the following almost- 
periodicity-type theorem. 
Proposition 1.1 ($L^2$-almost-periodicity, local version). Let $G$ be a group, let $A, B \subseteq G$ 
be finite subsets, and let $\varepsilon \in (0, 1)$ be a parameter. Suppose $S \subseteq G$ is such that 
$|B \cdot S| \le K|B|$. Then there is a set $T \subseteq S$ of size 

$$|T| \ge \frac{|S|}{(2K)^{9/\varepsilon^2}}$$

such that, for each $t \in TT^{-1}$, 

$$\|1_A * 1_B(xt) - 1_A * 1_B(x)\|_2^2 \le \varepsilon^2 |A| |B|^2.$$


The condition that there should be a set $S$ such that $|B \cdot S| \le K|B|$ is what justifies 
the terminology 'local': one does not need $B$ to be dense in its ambient group in order 
to apply the proposition effectively. All one needs is for $B$ to interact nicely with some 
large set $S$, a condition that we say more about in §2. If one knows little about the 
structure of $B$ one can still obtain useful conclusions from the proposition provided $B$ 
is dense in some structured set. For example, if $G = \mathbb{Z}$ and $B \subseteq [N] := \{1, \ldots, N\}$ 
with $|B| \ge \beta N$ (a case of interest in many problems) then one may take $S = [N]$ and 
$K = 2/\beta$. Similarly, if $G$ is finite then one can always take $S = G$, regardless of $B$, 
which immediately gives the following corollary. 

Corollary 1.2 ($L^2$-almost-periodicity, global version). Let $G$ be a finite group, let 
$A, B \subseteq G$, and let $\varepsilon \in (0,1)$ be a parameter. Suppose $B$ has density $\beta$. Then there 
is a set $T \subseteq G$ of size at least $(\beta/2)^{9/\varepsilon^2}|G|$ such that, for each $t \in TT^{-1}$, 

$$\|1_A * 1_B(xt) - 1_A * 1_B(x)\|_2^2 \le \varepsilon^2 |A| |B|^2.$$



On an informal level these results say that convolutions are somewhat continuous: one 
may find a large number of translates $t$ such that the function $1_A * 1_B$ does not change 
by much, in an $L^2$ sense, when translated by $t$. Having $L^2$-almost-periods provides 
one with good control in many applications, particularly those involving three-fold or 
higher convolutions, such as when dealing with the number of three-term progressions 
in a set or with a triple-fold product set $A \cdot B \cdot C$. But for certain applications involving 
only a single convolution it turns out that having $L^p$-almost-periods for a somewhat 
large $p$ is more useful. 
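
To make the notion of an $L^2$-almost-period concrete, here is a small brute-force sketch in $\mathbb{Z}_N$ (abelian, so left- and right-translates coincide). It is not the probabilistic method of the paper: it simply lists every $t$ for which the squared $L^2$ distance between $1_A * 1_B$ and its translate by $t$ is at most $\varepsilon^2 |A| |B|^2$. The modulus, the intervals `A` and `B`, the value of `eps` and the helper names are assumptions made for illustration.

```python
def conv_counts(A, B, N):
    """Values of 1_A * 1_B on Z_N, as a list indexed by group element."""
    f = [0] * N
    for a in A:
        for b in B:
            f[(a + b) % N] += 1
    return f

def shift_error(f, t):
    """Squared L^2 norm of the function x -> f(x + t) - f(x) on Z_N."""
    N = len(f)
    return sum((f[(x + t) % N] - f[x]) ** 2 for x in range(N))

N = 101
A = range(40)   # an interval, chosen so that the convolution is a 'tent' function
B = range(40)
eps = 0.1

f = conv_counts(A, B, N)
threshold = eps ** 2 * len(A) * len(B) ** 2
almost_periods = [t for t in range(N) if shift_error(f, t) <= threshold]
```

The corollary above asserts the existence of a large set $T$ whose difference set lies inside such a list; the brute-force search only shows what the conclusion looks like on a small example.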

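Proposition 1.3 ($L^p$-almost-periodicity, local version). Let $G$ be a group, let $A, B \subseteq G$ 
be finite subsets, and let $\varepsilon \in (0, 1)$ and $m \ge 1$ be parameters. Suppose $S \subseteq G$ is such 
that $|B \cdot S| \le K|B|$. Then there is a set $T \subseteq S$ of size 

$$|T| \ge \frac{|S|}{(2K)^{50m/\varepsilon}}$$

such that 

$$\|1_A * 1_B(xt) - 1_A * 1_B(x)\|_{2m}^{2m} \le \max\left(\varepsilon^m |AB| |B|^m, \|1_A * 1_B\|_m^m\right) \varepsilon^m |B|^m$$

for each $t \in TT^{-1}$. 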



As before, this has the following 'global' corollary. 
Corollary 1.4 ($L^p$-almost-periodicity, global version). Let $G$ be a finite group, let 
$A, B \subseteq G$ be subsets, and let $\varepsilon \in (0, 1)$ and $m \ge 1$ be parameters. Suppose $B$ has 
density $\beta$. Then there is a set $T \subseteq G$ of size at least $(\beta/2)^{50m/\varepsilon}|G|$ such that 

$$\|1_A * 1_B(xt) - 1_A * 1_B(x)\|_{2m}^{2m} \le \max\left(\varepsilon^m |AB| |B|^m, \|1_A * 1_B\|_m^m\right) \varepsilon^m |B|^m$$

for each $t \in TT^{-1}$. 

We give some further variants of the above propositions later in the paper. In particular one can 
with a slight change to the hypotheses find left-translates instead of right-translates, 
which may be more useful depending on the application. 

Our proofs of the above propositions are of a probabilistic nature, involving a 'random 
sampling' procedure that finds small subsets of one of the sets that behave similarly to 
the set itself (in a precise sense). This procedure is the same regardless of whether the 
group is commutative or not, which places our method in stark contrast to the Fourier- 
analytic methods that are typically the port of call for dealing with almost-periodicity 
in abelian groups. We say more about the abelian versions of the above results and the 
Fourier-analytic methods that lead to them in §9, turning now instead to applications 
of our results. 

1.3. Applications. We shall apply the almost-periodicity results in four directions in 
this paper, namely towards 

(i) non-commutative analogues of the Bogolyubov-Freiman-Halberstam-Ruzsa theory 
that shows that sumsets are structured, 

(ii) a low-density version of the Bourgain-Green theorem on long arithmetic 
progressions in sumsets $A + B$, 

(iii) a probabilistic proof of Roth's theorem on arithmetic progressions and 

(iv) a new result on the approximate translation-invariance of products of so-called 
strong $K$-approximate groups. 

We discuss each of these in turn. 

Structures in product sets. A general objective in additive combinatorics is to show that 
sumsets in abelian groups are rather structured objects. A rather useful such result due 
to Bogolyubov [2] that was highlighted by Ruzsa [34] in the additive-combinatorial 
context shows that sets $2A - 2A$ are highly structured, particularly if $A$ has small 
doubling. For non-abelian groups an analogue of this was recently proved by Sanders [37]. 

Theorem 1.5. Suppose $G$ is a group, $A \subseteq G$ is a finite set such that $|A^2| \le K|A|$ and 
$k \in \mathbb{N}$ is a parameter. Then there is a symmetric set $S$ containing the identity such that 

$$S^k \subseteq A^2 \cdot A^{-2} \quad \text{and} \quad |S| \ge \exp\left(-K^{O_k(1)}\right) |A|.$$

As noted in [37], this is a variant of a result used in Tao's proof [32] of a Freiman-type 
theorem on the structure of sets with small doubling in solvable groups. Freiman-type 
results are ones that characterize subsets of groups that are group-like (more precisely, 
subsets $A$ of a group $G$ that satisfy a small-doubling condition $|A^2| \le K|A|$ or a 
small-tripling condition $|A^3| \le K|A|$ for some fixed $K$), and there has been a concerted effort 
in recent years to try to establish such results in various classes of groups. In the 
commutative setting a rather precise and useful such characterization is provided by a 
theorem of Green and Ruzsa [20] that generalizes a fundamental theorem of Freiman 
[16]. In the non-commutative setting a number of interesting results have appeared 
recently [6l [71 El [151 1211 [311 [12] , though there is not yet a unified theory. Let us remark 
in the context of this paper, however, that results of the form of Theorem 1.5 can be 
useful in proving such results: abelian results that find large Bohr sets in $2A - 2A$ form a 
key step in many proofs of Freiman's theorem, and Theorem 1.5 itself was recently used 
by Green, Sanders and Tao [22] to provide combinatorial proofs of some Freiman-type 
results of Hrushovski [21]. 

The almost-periodicity results of this paper are particularly well-suited to proving results 
of the form of Theorem 1.5, and doing so with reasonable bounds. Indeed, the following 
is a virtually immediate consequence of Proposition 1.1. 

Theorem 1.6. Suppose $G$ is a group, $A \subseteq G$ is a finite set such that $|A^2| \le K|A|$ and 
$k \in \mathbb{N}$ is a parameter. Then there is a symmetric set $S \subseteq A^{-1}A$ containing the identity 
such that 

$$S^k \subseteq A^2 \cdot A^{-2} \quad \text{and} \quad |S| \ge \exp\left(-9k^2 K \log 2K\right) |A|.$$

Furthermore, each element of $S^k$ has at least $|A|^3/2K$ representations as $a_1 a_2 a_3^{-1} a_4^{-1}$ 
with $a_i \in A$. 

Four-fold product sets of the above form are particularly pleasant to analyze, but it is 
not much harder to obtain a result that works with only triple product sets. To state 
this concisely it is convenient to introduce a small piece of non-standard terminology: 
for a triple $(A, B, C)$ of finite subsets of $G$ and an element $x \in G$ we shall say that 
$x$ is $\gamma$-popular if $1_A * 1_B * 1_C(x) \ge \gamma(|A||B|)^{1/2}|C|$. That is, $x$ is $\gamma$-popular if it can 
be written as a product $abc$ with $a \in A$, $b \in B$ and $c \in C$ in at least $\gamma(|A||B|)^{1/2}|C|$ 
different ways. If $|A \cdot B \cdot C|$ is small then certainly there is a popular element, since 

$$|A||B||C| = \sum_{x \in G} 1_A * 1_B * 1_C(x) \le |A \cdot B \cdot C| \sup_{x \in A \cdot B \cdot C} 1_A * 1_B * 1_C(x)$$

(see (2.3)), but there are also much weaker conditions ensuring this. 
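
The pigeonhole argument behind this inequality can be checked directly in a small non-abelian group. The sketch below is an illustration only: the choice of the symmetric group on four letters, the three sets, and the names `compose` and `triple_counts` are assumptions. It finds the most popular element of a triple $(A, B, C)$ and the largest $\gamma$ for which it is $\gamma$-popular.

```python
from collections import Counter
from itertools import permutations, product
from math import sqrt

def compose(p, q):
    """Product pq in the symmetric group, acting as (pq)(i) = p(q(i))."""
    return tuple(p[i] for i in q)

def triple_counts(A, B, C):
    """Values of 1_A * 1_B * 1_C: the number of ways to write x = abc."""
    return Counter(compose(compose(a, b), c) for a, b, c in product(A, B, C))

S4 = list(permutations(range(4)))   # the 24 permutations of {0, 1, 2, 3}
A = S4[:5]
B = S4[3:9]
C = S4[10:14]

counts = triple_counts(A, B, C)     # its keys form the product set A.B.C
x, best = counts.most_common(1)[0]  # a most popular element and its count
gamma = best / (sqrt(len(A) * len(B)) * len(C))
```

By the displayed inequality, `best` is at least $|A||B||C|/|A \cdot B \cdot C|$, so `gamma` is at least $(|A||B|)^{1/2}/|A \cdot B \cdot C|$.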

Theorem 1.7. Let $G$ be a group, let $A_1, A_2, A_3 \subseteq G$ be finite, non-empty sets and 
let $k \in \mathbb{N}$ be a parameter. Suppose $x$ is a $(1/K)$-popular element for $(A_1, A_2, A_3)$ and 
that there is a set $D \subseteq G$ such that $|A_3 \cdot D| \le K'|A_3|$. Then there is a symmetric set 
$S \subseteq DD^{-1}$ containing the identity such that 

$$xS^k \subseteq A_1 A_2 A_3 \quad \text{and} \quad |S| \ge \exp\left(-36k^2 K^2 \log 2K'\right) |D|.$$

In the abelian setting the non-local version of this result is in the same vein as a result 
of Freiman, Halberstam and Ruzsa [17] that finds long arithmetic progressions or Bohr 
sets in $A + A + A$ (see also [44, Theorem 4.43]); the best bounds currently known in 
this direction are due to Sanders [38]. 
For the product of two sets the situation looks rather different, a phenomenon that has 
been observed in many different contexts. Whereas we cannot ensure that we can find 
a translate of a large iterated product set in $A \cdot B$, it turns out that we can always find 
a translate of any small subset of a large iterated product set. 

Theorem 1.8. Let $G$ be a group, let $A, B \subseteq G$ be finite, non-empty subsets and let 
$k, n$ be parameters. Suppose $|A \cdot B| \le K|A|$ and $|B \cdot D| \le K'|B|$. Then there is a 
symmetric set $S \subseteq DD^{-1}$ of size 

$$|S| \ge \exp\left(-150k^2 K \log 2K' \log 2n\right) |D|$$

such that the product set $A \cdot B$ contains a left-translate of any set $P \subseteq S^k$ of size at 
most $n$. 

This theorem is a straightforward consequence of the $L^p$-almost-periodicity of $1_A * 1_B$ 
given by Proposition 1.3. Our next application restricts this result to subsets of 
$\{1, \ldots, N\}$. 

Arithmetic progressions in sumsets $A + B$. Coupled with a 'structure-generation' lemma 
that finds arithmetic progressions in iterated sumsets $kS$, Theorem 1.8 quickly yields 
the following. 

Theorem 1.9. Let $N$ be a positive integer and let $A, B \subseteq [N]$ be non-empty sets of 
sizes $\alpha N$, $\beta N$. Then $A + B$ contains an arithmetic progression of length at least 

$$\frac{1}{2} \exp\left(c \left(\frac{\alpha \log N}{\log 4/\beta}\right)^{1/4}\right),$$

where $c > 0$ is an absolute constant. 



Results of this form have a rich history, starting with the paper [1] of Bourgain. There 
it was shown, using a very insightful and sophisticated manipulation of sets of Fourier 
coefficients in the group $\mathbb{Z}_p$, that if $A$ and $B$ are subsets of $[N]$ of densities $\alpha$ and $\beta$ then 
$A + B$ must contain an arithmetic progression of length at least 

$$\exp\left(c\left((\alpha\beta \log N)^{1/3} - \log\log N\right)\right) \tag{1.1}$$

for some absolute constant $c > 0$. This bound was improved by Green [18] using a 
different Fourier-analytic argument to the best bound that is currently known for 
high-density sets, increasing the exponent $1/3$ above to $1/2$; a similar bound has since also 
been established by Sanders [36] using another Fourier-analytic technique. By contrast, 
our result yields somewhat shorter arithmetic progressions for high-density sets (where 
$\alpha$ and $\beta$ are thought of as not depending on $N$) but is also able to deal with sets that 
are much smaller than previously possible. Whereas the previous bounds for the length 
of the arithmetic progressions one can find in $A + B$ are only non-trivial provided 
$\alpha\beta \ge C(\log\log N)^2/\log N$ for some absolute constant $C$, Theorem 1.9 requires only 
$\alpha(\log 4/\beta)^{-1} \ge C/\log N$. Thus, whereas at least one of the sets had to have density 
at least $C \log\log N/(\log N)^{1/2}$ with previous bounds, the above theorem allows us to 
deal with pairs of sets each of which may have density as low as $C \log\log N/\log N$. In 
fact, one of the sets may have density as low as $\exp(-(\log N)^{c})$, which illustrates a 
significant difference between our results and the Fourier-analytic ones. Our proof also 



FINDING ALMOST-PERIODS PROBABILISTICALLY 



adds another novelty: we are able to work directly in the group $\mathbb{Z}$, never needing to 
embed the sets in a group $\mathbb{Z}_p$ (as is typical). We are also able to give a local version of 
the result; we present this and the proofs in §6. 



Roth's theorem. Our next application concerns the quantity $r_3(N)$, the largest size of 
a subset of the integers $\{1, \ldots, N\}$ that is free from non-trivial three-term arithmetic 
progressions, that is, triples $(x, x + d, x + 2d)$ with $d \ne 0$. As a consequence of our 
probabilistic proof of Proposition 1.1 we are able to establish the following version of 
Roth's theorem [32] by completely combinatorial means. 

Theorem 1.10. There is a function $u$ with $u(N) \to \infty$ as $N \to \infty$ such that 

$$r_3(N) \le \frac{N}{(\log\log N)^{u(N)}}$$

for any positive integer $N$. 



This bound for $r_3$ is marginally stronger than Roth's original $r_3(N) \ll N/\log\log N$, 
the beautiful Fourier-analytic proof of which has become a model argument in additive 
combinatorics. Subsequent Fourier-analytic arguments have demonstrated better 
bounds for $r_3$: the best bound currently known is due to Bourgain, who in [5] established 
that 

$$r_3(N) \ll \frac{(\log\log N)^2}{(\log N)^{2/3}} N.$$

Roth's theorem has enjoyed many different proofs, including non-Fourier-analytic ones, 
and each new proof has typically offered a slightly different perspective on the problem. 
However, only the Fourier-analytic proofs seem to have given decent bounds: the meth- 
ods that have not used Fourier analysis have generally been accompanied by tower-type 
bounds, establishing only that 

$$r_3(N) \ll N/\log^* N;$$

see [44, Chapter 10] for references, as well as the more recent work [30]. (The iterated 
logarithm of $N$, $\log^* N$, is defined to be the number of times it is necessary to take 
the logarithm of $N$ in order to get a number less than or equal to 1, and thus grows 
extremely slowly.) It is therefore perhaps of interest that our method manages to give 
bounds of a similar quality to the Fourier-analytic proofs despite not using Fourier 
analysis. We give the proof of Theorem 1.10 in §7. 
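
To give a feeling for how slowly $\log^* N$ grows, the definition in the parenthetical remark above can be transcribed directly. This is a sketch assuming the natural logarithm, since the text does not fix a base; the function name `log_star` is a choice made here.

```python
import math

def log_star(n):
    """Number of times log must be applied to n to reach a value <= 1."""
    count = 0
    while n > 1:
        n = math.log(n)   # natural logarithm (an assumption of this sketch)
        count += 1
    return count

values = {n: log_star(n) for n in (1, 2, 10, 10 ** 6, 10 ** 100)}
```

Even for $N = 10^{100}$ only four logarithms are needed, which is why a bound of the form $N/\log^* N$ is so much weaker than $N/\log\log N$.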



Strong approximate groups. We present one final application of the probabilistic tech- 
nique: the following result says that products of certain 'group-like' sets must have 
strong almost-periodicity properties. 

Proposition 1.11. Let $A$ be a finite subset of a group and let $\varepsilon \in (0, 1)$. Suppose $A$ has 
the property that every $x \in A^2$ has at least $|A|/K$ representations as $ab$ with $a, b \in A$. 
Then there is a symmetric set $S \subseteq A^{-1}A$ of size 

$$|S| \ge \exp\left(-K^2 \log 2K \log 8/\varepsilon\right) |A|$$

such that, for each $t \in S$, 

$$|tA^2 \,\triangle\, A^2| \le \varepsilon |A^2|.$$
Green [12] has suggested that one might call sets $A$ that satisfy the hypothesis of 
this proposition strong $K$-approximate groups; that is, $A$ is a strong $K$-approximate 
group if $1_A * 1_A(x) \ge |A|/K$ for each $x \in A^2$. Clearly any subgroup of a group is a 
strong $K$-approximate group with $K = 1$, but there are more complex examples. For 
example, if $p > 3$ is a prime congruent to 3 mod 4 then the set $A \subseteq \mathbb{Z}_p$ consisting of 
non-zero squares is a strong $(2 + o(1))$-approximate group, for $A + A = \mathbb{Z}_p \setminus \{0\}$ and 
$1_A * 1_A(x) \ge (\tfrac{1}{2} - o(1))|A|$ for each non-zero $x \in \mathbb{Z}_p$. Note also that if $A \subseteq G$ and 
$B \subseteq H$ are strong $K_A$- and $K_B$-approximate groups then $A \times B \subseteq G \times H$ is a strong 
$K_A K_B$-approximate group. We make some further remarks about strong approximate 
groups later in the paper. 

The remainder of this paper is laid out as follows. In the next section we describe 
some standard background material from the subject of arithmetic combinatorics. In 
§3 we outline the basic idea behind our method and present the proofs of our 
almost-periodicity results. The proofs of the results on structures in product sets are very 
short and we give them immediately afterwards in §4. In §5 we establish a 
structure-generation lemma that allows us to pass from arbitrary sets of translates in abelian 
groups to structured sets of translates. In §6 we give the proof of Theorem 1.9 on 
arithmetic progressions in sumsets, and in §7 we present our proof of Roth's theorem. 
We present the proof of Proposition 1.11 on strong approximate groups in §8 and we 
close in §9 with some further remarks, including a comparison with Fourier-analytic 
results. 

1.4. Acknowledgements. We would like to thank Tom Sanders for many interesting 
and helpful conversations relating to several of the results of this paper. The second- 
named author is grateful for the support of an EPSRC Postdoctoral Fellowship, enjoyed 
while part of this work was carried out. 

2. Preliminaries on convolutions and product sets 

In this section we record some useful standard results about convolutions and product 
sets; it may be largely skipped by those familiar with additive combinatorics. We follow 
Tao [43]. 

For functions on abelian groups the operation of convolution is commutative; this is not 
true in general for non-abelian groups. Convolution is, however, always bilinear and 
associative. A crucial link between convolutions and products is that the support of 
$1_{A_1} * \cdots * 1_{A_k}$ is the product set $A_1 \cdots A_k$. More precisely, 

$$1_{A_1} * \cdots * 1_{A_k}(x) = |\{(a_1, \ldots, a_k) \in A_1 \times \cdots \times A_k : a_1 \cdots a_k = x\}|; \tag{2.1}$$

convolutions thus count how many representations an element of a product of $k$ sets 
has as a product of elements of the $k$ sets. For pairs of sets one also has the interpretation 

$$1_A * 1_B(x) = |A \cap xB^{-1}| = |B \cap A^{-1}x|.$$

For functions this change between left-translates and right-translates is illustrated by 
the reflection property 

$$\widetilde{f * g} = \tilde{g} * \tilde{f} \tag{2.2}$$
where $\tilde{f}(x) := f(x^{-1})$. Note that $\tilde{1}_X = 1_{X^{-1}}$. Since convolutions are counts, sums of 
convolutions are also counts, and a full sum counts a particularly simple quantity: 

$$\sum_{x \in G} 1_{A_1} * \cdots * 1_{A_k}(x) = |A_1| \cdots |A_k|. \tag{2.3}$$

Many results in this paper involve conditions on the cardinalities of product sets. For 
two finite sets $A$ and $B$ in a group $G$, one always has the inequalities 

$$\max(|A|, |B|) \le |A \cdot B| \le |A||B|$$

with equality possible in various scenarios. Of course $|A \cdot B| \le |G|$ as well. Of particular 
importance to this paper are the cases when the product set $A \cdot B$ is small, though 
precisely what this means will depend on the context. Generally we shall say that 
$|A \cdot B|$ is small if it is at most $K|A|$ or $K|B|$ for some fixed number $K$, i.e., if it is within 
a constant factor of being as small as it could be. One generally thinks of a condition 
$|A \cdot B| \le K|B|$ as showing that $A$ and $B$ share some structure, particularly if $A$ and $B$ 
are close in size. In particular this implies that $A$ and $B$ must themselves be somewhat 
structured, as follows from [43, Lemma 3.2]. 

Lemma 2.1 (Ruzsa triangle inequality). Let $A, B, C \subseteq G$ be finite, non-empty subsets 
of a group. Then 

$$|A \cdot C^{-1}| \le \frac{|A \cdot B^{-1}| \, |B \cdot C^{-1}|}{|B|}.$$

Our almost-periodicity theorems are thus particularly effective when one of the sets $A$ 
and $B$ is structured in the sense of having small doubling $|A^2| \le K|A|$ or $|B^2| \le K|B|$, 
or small differencing $|A \cdot A^{-1}| \le K|A|$ or $|B \cdot B^{-1}| \le K|B|$ for some small, fixed $K$. In 
abelian groups the following result is particularly useful for bounding sizes of sumsets; 
see [44, Chapter 6] for references and a proof. 

Theorem 2.2 (Plünnecke-Ruzsa inequality). Let $A$ and $B$ be finite subsets of an abelian 
group, and suppose $|A + B| \le K|A|$. Then 

$$|nB - mB| \le K^{n+m}|A|$$

for all integers $m, n \ge 1$. 

As previously noted, however, one can substitute the above notion of structure for a 
much weaker one: that of being dense in a structured set (such as the ambient group). 
There are other ways in which one can weaken the notion of structure used; recall our 
definition of the multiplicative energy between two sets: 

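$$E(A, B) = \sum_{x \in G} 1_A * 1_B(x)^2.$$

If the product set $A \cdot B$ is small compared to either $A$ or $B$, in the sense that it has size 
at most $K|A|$ or $K|B|$, then $E(A, B)$ is large: 

$$E(A, B) \ge \frac{1}{|A \cdot B|} \left(\sum_{x \in G} 1_A * 1_B(x)\right)^2 = \frac{|A|^2 |B|^2}{|A \cdot B|}, \tag{2.4}$$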
where the inequality follows from the Cauchy-Schwarz inequality. On the other hand, 
the condition $E(A, A) \ge |A|^3/K$ need not imply that $|A^2|$ is small, even in the abelian 
setting. We mention that there is a partial converse, however, that could be used in 
certain applications to keep the effectiveness of the bounds of this paper in the case 
when the sets in question have high multiplicative energy instead of small doubling: 
this is known as the Balog-Szemerédi-Gowers theorem. We point the interested reader 
to [44, Chapter 2] and [43, Section 5] for more information on this. 
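
Both halves of this discussion can be seen on a small example in $\mathbb{Z}$. The sketch below is illustrative: the set `A`, built as the union of an arithmetic progression and a set with no additive coincidences, and the helper names are choices made here. It checks the Cauchy-Schwarz bound (2.4) and exhibits a set whose additive energy is at least $|A|^3/12$ although its doubling $|A + A|/|A|$ exceeds 5 (and grows linearly in `n` for this construction).

```python
from collections import Counter

def rep_counts(A, B):
    """Values of 1_A * 1_B in the additive group Z."""
    return Counter(a + b for a in A for b in B)

def energy(A, B):
    """Additive energy E(A, B) = sum_x (1_A * 1_B(x))^2."""
    return sum(r * r for r in rep_counts(A, B).values())

n = 20
progression = set(range(n))                    # contributes about 2n^3/3 to the energy
spread = {10 ** 6 * 3 ** i for i in range(n)}  # all pairwise sums are distinct
A = progression | spread

E = energy(A, A)
sumset_size = len(rep_counts(A, A))
cauchy_schwarz_bound = len(A) ** 4 / sumset_size   # right-hand side of (2.4) with B = A
```

The progression alone forces the energy to be large, while the `spread` part alone forces the sumset to be large, so high energy does not imply small doubling.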

Many of the above properties have analogues for functions more general than indicator 
functions, of course. The distinction between indicator functions and more general 
functions tends not to be particularly important in practice; see the comments later in the paper. 



3. Proofs of the main propositions 



Each of our propositions on almost-periodicity has to do with finding translates by 
which the convolution $1_A * 1_B$ is approximately invariant in some norm. There are two 
basic ideas behind the proofs of these propositions. The first is that if one selects a 
small random subset $C \subseteq A$, then with high probability the convolution $1_C * 1_B$ will 
approximate the function $\frac{|C|}{|A|} 1_A * 1_B$. This means that the approximation will hold for 
many subsets $C$ of $A$; so many, in fact, that there must be some relations amongst the 
sets: lots of them must in fact be translates of one another, which is the second idea. 
The translates so obtained correspond to translates that leave $1_A * 1_B$ approximately 
invariant (in the appropriate norm). 

Surprisingly little background is needed to prove Proposition 1.1; all we shall assume 
is some basic familiarity with the probabilistic method; see for example [1] for 
more details on this. We shall prove the following equivalent version of Proposition 1.1; 
the equivalence follows immediately from the reflection identity (2.2). 

Proposition 3.1 ($L^2$-almost-periodicity, left-translates). Let $G$ be a group, let 
$A, B \subseteq G$ be finite subsets, and let $\varepsilon \in (0, 1)$ be a parameter. Suppose $S \subseteq G$ is such that 
$|S \cdot A| \le K|A|$. Then there is a set $T \subseteq S^{-1}$ of size 

$$|T| \ge \frac{|S|}{(2K)^{9/\varepsilon^2}}$$

such that, for each $t \in TT^{-1}$, 

$$\|1_A * 1_B(tx) - 1_A * 1_B(x)\|_2^2 \le \varepsilon^2 |A|^2 |B|.$$



Proof. Let $k$ be an integer between $1$ and $|A|/2$ that we shall fix later and let $C$ be 
a random subset of $A$ of size $k$, chosen uniformly out of all such sets. Let us write 
$\mu_C := 1_C \cdot |A|/k$ for a normalized version of the indicator function of $C$. It is easy to 
see that $\mathbb{E}\,\mu_C * 1_B(x) = 1_A * 1_B(x)$ for each $x \in G$ and that the variance 

$$\mathrm{Var}(\mu_C * 1_B(x)) = \mathbb{E}\,|\mu_C * 1_B(x) - 1_A * 1_B(x)|^2$$

satisfies 

$$\mathrm{Var}(\mu_C * 1_B(x)) \le \frac{|A|}{k} \, 1_A * 1_B(x).$$
Summing this inequality over all $x \in A \cdot B$, the support of $1_A * 1_B$, we obtain 

$$\mathbb{E}\,\|\mu_C * 1_B(x) - 1_A * 1_B(x)\|_2^2 \le |A|^2|B|/k. \tag{3.1}$$

Let us say that a set $C \in \binom{A}{k}$ approximates $A$ if the bound 

$$\|\mu_C * 1_B(x) - 1_A * 1_B(x)\|_2^2 \le 2|A|^2|B|/k$$

holds. By (3.1) and Markov's inequality we thus have that 

$$\mathbb{P}_{C \in \binom{A}{k}}(C \text{ approximates } A) \ge 1/2, \tag{3.2}$$

where $\mathbb{P}_{C \in \binom{X}{k}}$ refers to the uniform distribution on $k$-sets in a set $X$. 

We now consider $k$-sets $C$ chosen uniformly at random from $Y := S \cdot A$ instead of $A$. 
Let $t \in S^{-1}$. Clearly 

$$\mathbb{P}_{C \in \binom{Y}{k}}(tC \text{ approximates } A) = \mathbb{P}_{C \in \binom{tY}{k}}(C \text{ approximates } A),$$

and since $A \subseteq tY$ we see that this is at least 

$$\binom{|A|}{k} \Big/ \binom{|S \cdot A|}{k} \cdot \mathbb{P}_{C \in \binom{A}{k}}(C \text{ approximates } A).$$

By (3.2) and the hypothesis that $|S \cdot A| \le K|A|$, then, we have that 

$$\mathbb{P}_{C \in \binom{Y}{k}}(tC \text{ approximates } A) \ge \frac{1}{(2K)^k}.$$

Summing this inequality over all $t \in S^{-1}$ thus gives 

$$\mathbb{E}_{C \in \binom{Y}{k}} |\{t \in S^{-1} : tC \text{ approximates } A\}| \ge \frac{|S|}{(2K)^k}.$$

In particular there exists a set $C$ for which the set 

$$T := \{t \in S^{-1} : tC \text{ approximates } A\}$$

has size at least $|S|/(2K)^k$. For this set $C$ we have 

$$\|\mu_C * 1_B(x) - 1_A * 1_B(tx)\|_2^2 \le 2|A|^2|B|/k$$

for each $t \in T$, whence 

$$\|1_A * 1_B(tx) - 1_A * 1_B(x)\|_2^2 \le 8|A|^2|B|/k$$

for each $t \in TT^{-1}$ by the triangle inequality. The proposition now follows upon choosing 
$k := \lceil 8/\varepsilon^2 \rceil$. (Note that the conclusion of the proposition is trivial if $k > |A|/2$.) □ 
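
The sampling step of this proof can be checked exactly on a small example by enumerating every $k$-subset $C$ of $A$ instead of sampling. The following sketch is an illustration in $\mathbb{Z}_N$, with `N`, `A`, `B`, `k` and the helper name `conv_counts` chosen here rather than taken from the paper. It computes the exact expectation appearing in (3.1) and the exact proportion of sets that approximate $A$ as in (3.2), using rational arithmetic to avoid rounding.

```python
from fractions import Fraction
from itertools import combinations

N = 31
A = [0, 1, 2, 3, 5, 8, 13, 21]
B = [0, 1, 4, 9, 16]
k = 3                      # must satisfy 1 <= k <= |A|/2

def conv_counts(X, Y):
    """Values of 1_X * 1_Y on Z_N, as a list indexed by group element."""
    out = [0] * N
    for x in X:
        for y in Y:
            out[(x + y) % N] += 1
    return out

full = conv_counts(A, B)                       # 1_A * 1_B
scale = Fraction(len(A), k)                    # mu_C = (|A|/k) 1_C
bound = Fraction(len(A) ** 2 * len(B), k)      # right-hand side of (3.1)

# Squared L^2 error || mu_C * 1_B - 1_A * 1_B ||_2^2 for every k-subset C of A.
errors = []
for C in combinations(A, k):
    sampled = conv_counts(C, B)
    errors.append(sum((scale * u - v) ** 2 for u, v in zip(sampled, full)))

mean_error = sum(errors) / len(errors)                        # exact expectation
good_fraction = Fraction(sum(e <= 2 * bound for e in errors), len(errors))
```

Enumeration is only feasible for tiny sets; the point of the proof is that the same averaging works for arbitrary finite $A$ with $k$ depending only on $\varepsilon$.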

We need to argue only a little more subtly in order to establish the analogous estimate for 
higher $L^p$ norms: we just make use of higher moments than the variance. In order to do 
this we shall need some more information about random variables of the type $1_C * 1_B(x)$ 
considered above. Since $1_C * 1_B(x) = |C \cap xB^{-1}|$, a moment's thought reveals that this 
random variable follows a hypergeometric distribution: a random variable $X$ is said to 
follow a hypergeometric distribution with parameters $N$, $M$ and $k$ if 

$$\mathbb{P}(X = j) = \binom{M}{j} \binom{N - M}{k - j} \Big/ \binom{N}{k}$$

for each integer $j \ge 0$. Thus one may think of $X$ as counting the number of marked 
objects one obtains when selecting $k$ objects randomly and without replacement from 
a population of $N$ objects, a total $M$ of which are marked. The proof of the following 
bounds on the moments of hypergeometrically distributed random variables is 
elementary, though somewhat tangential to our main arguments, so we postpone it till 
Appendix A. 

Lemma 3.2. Let $m \ge 1$ and suppose that $X$ follows a hypergeometric distribution with 
parameters $N$, $M$ and $k$ as above. Then 

$$\mathbb{E}\,|X - kM/N|^{2m} \le 2\left(3mkM/N + m^2\right)^m.$$



With these estimates in hand the proof of Proposition 1.3 is straightforward. Again we 
prove the following trivially equivalent version. 

Proposition 3.3 ($L^p$-almost-periodicity, left-translates). Let $G$ be a group, let 
$A, B \subseteq G$ be finite subsets, and let $\varepsilon \in (0, 1)$ and $m \ge 1$ be parameters. Suppose $S \subseteq G$ is such 
that $|S \cdot A| \le K|A|$. Then there is a set $T \subseteq S^{-1}$ of size 

$$|T| \ge \frac{|S|}{(2K)^{50m/\varepsilon}}$$

such that 

$$\|1_A * 1_B(tx) - 1_A * 1_B(x)\|_{2m}^{2m} \le \max\left(\varepsilon^m |AB| |A|^m, \|1_A * 1_B\|_m^m\right) \varepsilon^m |A|^m$$

for each $t \in TT^{-1}$. 



Proof. We follow the proof of Proposition 3.1, letting $C$ be a random subset of $A$ of 
size $k$ for some $k$ that is to be fixed. Fix an element $x \in G$. As alluded to above, the 
random variable $1_C * 1_B(x)$ follows a hypergeometric distribution: 

$$\mathbb{P}(1_C * 1_B(x) = j) = \binom{M}{j} \binom{|A| - M}{k - j} \Big/ \binom{|A|}{k},$$

where $M := 1_A * 1_B(x) = |A \cap xB^{-1}|$, the probability being nothing but the proportion 
of $k$-sets $C$ in $A$ that contain precisely $j$ elements from $A \cap xB^{-1}$. Lemma 3.2 therefore 
tells us that 

$$\mathbb{E}\left|1_C * 1_B(x) - \frac{k}{|A|} 1_A * 1_B(x)\right|^{2m} \le 2\left(3mk \cdot 1_A * 1_B(x)/|A| + m^2\right)^m$$

or, using the notation $\mu_C := \frac{|A|}{k} 1_C$, 

$$\mathbb{E}\,|\mu_C * 1_B(x) - 1_A * 1_B(x)|^{2m} \le 2(m|A|/k)^m \left(3 \cdot 1_A * 1_B(x) + m|A|/k\right)^m.$$

Summing over all $x \in A \cdot B$ then yields 

$$\mathbb{E}\,\|\mu_C * 1_B(x) - 1_A * 1_B(x)\|_{2m}^{2m} \le 2(m|A|/k)^m \sum_{x \in A \cdot B} \left(3 \cdot 1_A * 1_B(x) + m|A|/k\right)^m,$$

the right-hand side of which we denote by $\Delta$. From this it follows by Markov's inequality 
that 

$$\mathbb{P}\left(\|\mu_C * 1_B(x) - 1_A * 1_B(x)\|_{2m}^{2m} \le 2\Delta\right) \ge 1/2.$$
We may now argue exactly as in the proof of Proposition 3.1, replacing the $L^2$-version 
of approximation there with this $L^{2m}$-version. We thus obtain a set $C \subseteq S \cdot A$ of size $k$ 
such that the set 

$$T := \{t \in S^{-1} : \|\mu_C * 1_B(x) - 1_A * 1_B(tx)\|_{2m}^{2m} \le 2\Delta\}$$

has size at least $|S|/(2K)^k$. The result now follows from the triangle inequality upon 
noting the bound 

$$\Delta \le 2(m|A|/k)^m \, 3.05^m \max\left(\|1_A * 1_B\|_m, \; 20m|AB|^{1/m}|A|/k\right)^m$$

and choosing $k := \lceil 49m/\varepsilon \rceil$. □ 

Remark 3.4. We have not attempted to optimize the constant 50 that appears in the 
exponent of the density of the set T in this proposition; one can certainly reduce it, 
though any such reduction would be largely irrelevant for our applications. 



4. Structures in product sets 



In this section we provide proofs of the applications discussed in the first part of §1.31 
These results were all versions of the statement that product sets are structured objects, 
with various meanings. Theorem 11.61 said that sets ■ y4~^ are structured in the sense 
that they contain large iterated product sets; this is perhaps the most straightforward 
consequence of Proposition 13. It 

Proof of Theorem 1.6. Set $\varepsilon := 1/(k\sqrt{K})$ and apply Proposition 3.1 to $A$ with $B = S = A$ to obtain a set $T \subseteq A$ of size at least $|A|/(2K)^{9k^2K}$ such that

$$\|1_A * 1_A(tx) - 1_A * 1_A(x)\|_2^2 \le \varepsilon^2|A|^3$$

for each $t \in TT^{-1}$. Write $S := TT^{-1}$. By the triangle inequality we then have

$$\|1_A * 1_A(tx) - 1_A * 1_A(x)\|_2^2 \le k^2\varepsilon^2|A|^3 = |A|^3/K$$

for each $t \in S^k$. The left-hand side of this inequality can be expanded as

$$2\sum_{x \in G} 1_A * 1_A(x)^2 - 2\sum_{x \in G} 1_A * 1_A(tx)\,1_A * 1_A(x) = 2\bigl(E(A,A) - 1_A * 1_A * 1_{A^{-1}} * 1_{A^{-1}}(t)\bigr).$$

Since $A$ has small doubling, it also has large multiplicative energy by (2.4): $E(A,A) \ge |A|^3/K$. Hence

$$1_A * 1_A * 1_{A^{-1}} * 1_{A^{-1}}(t) \ge |A|^3(1/K - 1/2K) = |A|^3/2K.$$

Since $1_A * 1_A * 1_{A^{-1}} * 1_{A^{-1}}$ has support $A^2 \cdot A^{-2}$, we thus have that $S^k \subseteq A^2 \cdot A^{-2}$ as desired. Furthermore, each element $t \in S^k$ has many representations as products in the way claimed, as follows from (2.1). □
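The expansion of the squared $L^2$-distance used in this proof is easy to confirm numerically. The following sketch works in the abelian group $\mathbb{Z}_{17}$ written additively (the modulus and the set `A` are illustrative assumptions), where $A^{-1}$ becomes $-A$.

```python
from collections import Counter

N = 17                      # Z_17, an illustrative choice
A = [0, 1, 3, 4, 9]

def conv(f, g):
    """Convolution on Z_N of two finitely supported functions stored as Counters."""
    h = Counter()
    for y, fy in f.items():
        for z, gz in g.items():
            h[(y + z) % N] += fy * gz
    return h

ind_A = Counter({a: 1 for a in A})
ind_neg_A = Counter({(-a) % N: 1 for a in A})

f = conv(ind_A, ind_A)                          # 1_A * 1_A
energy = sum(v * v for v in f.values())         # E(A, A)
four_fold = conv(conv(f, ind_neg_A), ind_neg_A) # 1_A * 1_A * 1_{-A} * 1_{-A}

def squared_distance(t):
    """|| 1_A*1_A(x + t) - 1_A*1_A(x) ||_2^2."""
    return sum((f[(x + t) % N] - f[x]) ** 2 for x in range(N))

def expanded_form(t):
    """2 (E(A, A) - 1_A*1_A*1_{-A}*1_{-A}(t))."""
    return 2 * (energy - four_fold[t])
```

In particular the four-fold convolution evaluated at the identity is exactly the energy, so the distance vanishes at $t = 0$.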



We record the following more general version of Theorem 1.6; the proof is the same except that we do not specialize all the parameters when applying Proposition 3.1.

Theorem 4.1. Let $G$ be a group, let $A, B \subseteq G$ be finite, non-empty subsets and let $k \in \mathbb{N}$ be a parameter. Suppose $E(A,B) \ge |A|^2|B|/K$ and that $|D \cdot A| \le K'|A|$ for some set $D \subseteq G$. Then there is a symmetric set $S \subseteq D^{-1}D$ containing the identity such that

$$S^k \subseteq A \cdot B \cdot B^{-1} \cdot A^{-1} \quad\text{and}\quad |S| \ge \exp(-9k^2K\log 2K')\,|D|.$$

Furthermore, each element of $S^k$ has at least $|A|^2|B|/2K$ representations as $a_1b_1b_2^{-1}a_2^{-1}$ with $a_i \in A$, $b_i \in B$.

Note that this really does generalize Theorem 1.6 by (2.4).

Theorem 1.7 dealt with the product of three sets under the assumption of the existence of a 'popular element'. Note that there are various conditions that will ensure the existence of a popular element for a triple of sets $(A, B, C)$: $A \cdot B \cdot C$ being small will certainly do, as will $\|1_A * 1_B * 1_C\|_2$ being large. A lower bound on the multiplicative energy $E(A,B)$ is also a popularity-type condition, $E(A,B)$ equalling $1_A * 1_B * 1_{B^{-1}} * 1_{A^{-1}}(1)$, and the pigeonhole principle shows that if the multiplicative energy $E(A,B)$ is large then there is a popular element for the triple $(B, B^{-1}, A^{-1})$.
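The identification of the energy with a four-fold convolution evaluated at the identity, and the pigeonhole step, can be illustrated in a small non-abelian group. The sketch below assumes the symmetric group $S_3$ (permutations stored as tuples) and two arbitrarily chosen subsets `A` and `B`; none of these choices come from the text.

```python
from itertools import permutations, product

G = list(permutations(range(3)))        # the symmetric group S_3
identity = (0, 1, 2)

def mul(p, q):
    """Composition of permutations: (pq)(i) = p[q[i]]."""
    return tuple(p[q[i]] for i in range(3))

def inv(p):
    r = [0] * 3
    for i, pi in enumerate(p):
        r[pi] = i
    return tuple(r)

def conv(f, g):
    """f * g(x) = sum over y of f(y) g(y^{-1} x)."""
    return {x: sum(f[y] * g[mul(inv(y), x)] for y in G) for x in G}

def indicator(S):
    return {x: int(x in S) for x in G}

A = {(0, 1, 2), (1, 0, 2), (1, 2, 0)}   # illustrative subsets of S_3
B = {(0, 1, 2), (0, 2, 1)}
A_inv = {inv(a) for a in A}
B_inv = {inv(b) for b in B}

# E(A, B) as the number of quadruples with a b = a' b'.
energy = sum(1 for a, b, a2, b2 in product(A, B, A, B) if mul(a, b) == mul(a2, b2))

ab = conv(indicator(A), indicator(B))
energy_l2 = sum(v * v for v in ab.values())
four_fold = conv(conv(ab, indicator(B_inv)), indicator(A_inv))

# 1_B * 1_{B^{-1}} * 1_{A^{-1}}: its largest value is a popular element's weight.
triple = conv(conv(indicator(B), indicator(B_inv)), indicator(A_inv))
```

Since $E(A,B)$ is a sum of $|A|$ values of the triple convolution, its maximum is at least $E(A,B)/|A|$.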

Proof of Theorem 1.7. Recall that we are given three finite sets $A_1$, $A_2$ and $A_3$, a 'popular' element $x$ such that

$$1_{A_1} * 1_{A_2} * 1_{A_3}(x) \ge (|A_1||A_2|)^{1/2}|A_3|/K,$$

and a set $D$ such that $|A_3 \cdot D| \le K'|A_3|$. Apply Proposition 1.1 to the sets $A = A_2$ and $B = A_3$ with $\varepsilon := 1/2kK$ to obtain a set $T \subseteq D$ of size at least $|D|/(2K')^{36k^2K^2}$ such that

$$\|1_{A_2} * 1_{A_3}(yt) - 1_{A_2} * 1_{A_3}(y)\|_2^2 \le \varepsilon^2|A_2||A_3|^2$$

for each $t \in S := TT^{-1}$. Thus for each $t \in S^k$ we have

$$\|1_{A_2} * 1_{A_3}(yt) - 1_{A_2} * 1_{A_3}(y)\|_2^2 \le |A_2||A_3|^2/4K^2.$$

Let $t \in S^k$. Then

$$|1_{A_1} * 1_{A_2} * 1_{A_3}(xt) - 1_{A_1} * 1_{A_2} * 1_{A_3}(x)| = \Bigl|\sum_{y \in G} 1_{A_1}(y)\bigl(1_{A_2} * 1_{A_3}(y^{-1}xt) - 1_{A_2} * 1_{A_3}(y^{-1}x)\bigr)\Bigr|$$
$$\le |A_1|^{1/2}\,\|1_{A_2} * 1_{A_3}(yt) - 1_{A_2} * 1_{A_3}(y)\|_2,$$

the inequality being an application of the Cauchy-Schwarz inequality. Thus

$$|1_{A_1} * 1_{A_2} * 1_{A_3}(xt) - 1_{A_1} * 1_{A_2} * 1_{A_3}(x)| \le (|A_1||A_2|)^{1/2}|A_3|/2K.$$

Since $x$ is a $(1/K)$-popular element and $t \in S^k$ was arbitrary, this completes the proof. □

We turn now to the case of two sets. Theorem 1.8 is a special case of the following result, which has the advantage of giving stronger results in the situation when $|A \cdot B|$ is not small but the multiplicative energy $E(A,B)$ is still large.

Theorem 4.2. Let $G$ be a group, let $A, B \subseteq G$ be finite, non-empty subsets and let $k, n \in \mathbb{N}$ be parameters. Suppose that

(i) $E(A,B) \ge |A||B|^2/K_1$,
(ii) $|A \cdot B| \le K_2|A|$,
(iii) $|B \cdot S| \le K_3|B|$.

Then there is a set $T \subseteq S$ of size

$$|T| \ge \exp\bigl(-150k^2(K_1K_2)^{1/2}(\log 2K_3)(\log 2n)\bigr)\,|S|$$

such that the product set $A \cdot B$ contains a left-translate of any set $P \subseteq (TT^{-1})^k$ of size at most $n$.


Proof. We may assume that $n \ge 2$. Set $m := \log 2n$, define $\gamma$ by requiring $\|1_A * 1_B\|_m^m = \gamma^m|AB||B|^m$ and set $\varepsilon := \gamma/ek^2$. Applying Proposition 1.3 to $A$ and $B$ with these parameters gives us a set $T \subseteq S$ with

$$|T| \ge \frac{|S|}{(2K_3)^{50m/\varepsilon}}$$

such that

$$\|1_A * 1_B(xt) - 1_A * 1_B(x)\|_{2m}^{2m} \le \varepsilon^m|B|^m\|1_A * 1_B\|_m^m$$

for each $t \in TT^{-1}$. Let $P \subseteq (TT^{-1})^k$ be a set of size at most $n$. Suppose for a contradiction that $A \cdot B$ does not contain a left-translate of $P$. Then for every $x \in G$ there must be an element $t \in P$ for which $xt \notin A \cdot B$, i.e., for which $1_A * 1_B(xt) = 0$. Hence

$$nk^{2m}\varepsilon^m|B|^m\|1_A * 1_B\|_m^m \ge \sum_{t \in P}\|1_A * 1_B(xt) - 1_A * 1_B(x)\|_{2m}^{2m} \ge \sum_{x \in G} 1_A * 1_B(x)^{2m}.$$

By the Cauchy-Schwarz inequality this is at least $\|1_A * 1_B\|_m^{2m}/|AB|$. Recalling the definitions of $\varepsilon$ and $m$ then gives the desired contradiction; hence there must be some element $x$ for which $xP \subseteq A \cdot B$. The result now follows upon noting that $\gamma \ge 1/(K_1K_2)^{1/2}$; this follows from Hölder's inequality and (2.3). □

Remark 4.3. The constant 150 in the conclusion should not be taken seriously; it can 
obviously be improved. 

Remark 4.4. If $|A \cdot B|$ is not small compared to $|A|$ then the conclusion of the theorem becomes much less effective. If one still has the energy condition $E(A,B) \ge |A||B|^2/K$ and $A$ and $B$ are of a similar size then one can use the Balog-Szemerédi-Gowers theorem [131 Theorem 5.2] to obtain large subsets $A' \subseteq A$ and $B' \subseteq B$ that one can then apply the theorem to effectively; this would yield better bounds than using a large value of $K_2$ directly. We omit the details.
5. Obtaining structured sets of translates

While Propositions 1.1 and 1.3 yield very large sets of translates for which $1_A * 1_B$ is approximately translation-invariant, one often needs these sets to be structured as well. Indeed, for the abelian applications in this paper we shall need to find arithmetic progressions of such translates. With Fourier-analytic methods the existence of an arithmetic progression of almost-periods is usually easy to obtain since one usually gets a Bohr set of translates, but we do not have this convenience. Instead we shall generate the structure by repeated set-addition.

We say that an arithmetic progression P in an abelian group has length k if it can be 
written as 

$$P = \{a, a + d, \ldots, a + (k-1)d\}$$

for some non-zero element d. Note that this notion may be somewhat degenerate in 
some groups. 

Lemma 5.1. Let $G$ be an abelian group, let $S \subseteq G$ be a finite subset that satisfies $|S + S| \le K|S|$ or $|S - S| \le K|S|$ and let $k \in \mathbb{N}$. Suppose $A \subseteq S$ satisfies $|A| \ge \delta|S|$ where

$$\delta > \frac{K^{3k/2}}{|S|^{1/(k+1)}}.$$

Then the set $kA - kA$ contains a symmetric arithmetic progression of length at least $2^{k+1} + 1$ passing through 0, with non-zero step $d \in A - A$.

To prove this we require a simple preliminary result. (This is not required if $K = 1$, as would be the case if $S$ is a group.)

Lemma 5.2. Let $G$ be an abelian group and let $A \subseteq G$ be a finite subset satisfying $|A + A| \le K|A|$ or $|A - A| \le K|A|$. Let $k \in \mathbb{N}$. Then

$$|A - 2^k \cdot A| \le K^{3k}|A|.$$

Here we write $\lambda \cdot A$ for the dilate $\{\lambda a : a \in A\}$. This result is Theorem 15 of Bukh [9] specialized to the case $\lambda = -2$. We include the short proof for completeness.



Proof. By the Ruzsa triangle inequality, Lemma 2.1, we have that

$$|A - 2^k \cdot A| \le \frac{|A - 2 \cdot A|\,|2 \cdot A - 2^k \cdot A|}{|A|} \le \frac{|A - 2 \cdot A|\,|A - 2^{k-1} \cdot A|}{|A|}.$$

Hence

$$\frac{|A - 2^k \cdot A|}{|A|} \le \left(\frac{|A - 2 \cdot A|}{|A|}\right)^k.$$

The lemma then follows from the instance $|A - 2 \cdot A| \le K^3|A|$ of the Plünnecke-Ruzsa inequality, Theorem 2.2. □



Proof of Lemma 5.1. It suffices to show that there are distinct elements $a, b \in A$ for which

$$2^j(a - b) \in A - A \quad\text{for each } j = 1, \ldots, k,$$

for then $[-2^k, 2^k] \cdot (a - b) \subseteq kA - kA$ by binary expansion. Upon rearranging, this is equivalent to there being distinct $a, b \in A$ and elements $x_j, y_j \in A$ for which

$$x_1 - 2a = y_1 - 2b,$$
$$x_2 - 4a = y_2 - 4b,$$
$$\vdots$$
$$x_k - 2^ka = y_k - 2^kb.$$

We claim that this system of equations must have a solution with $a \ne b$. Indeed, for each $(k+1)$-tuple of elements $(a, x_1, \ldots, x_k) \in A^{k+1}$, set

$$f(a, x_1, \ldots, x_k) := (x_1 - 2a, x_2 - 4a, \ldots, x_k - 2^ka).$$

The image of this function is a subset of $(S - 2 \cdot S) \times \cdots \times (S - 2^k \cdot S)$, which by Lemma 5.2 has size at most $K^{3k(k+1)/2}|S|^k$. If $|A|^{k+1} > K^{3k(k+1)/2}|S|^k$, which is the case given our bound on $\delta$, then there must be two distinct tuples $\mathbf{a} = (a, x_1, \ldots, x_k)$ and $\mathbf{b} = (b, y_1, \ldots, y_k)$ in $A^{k+1}$ for which $f(\mathbf{a}) = f(\mathbf{b})$. Clearly such tuples must have $a \ne b$ and so provide a non-trivial solution to our system. □
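The two ingredients of this proof, the pigeonhole collision for the map $f$ and the binary-expansion step, can be traced on a concrete set of integers. The following is a minimal sketch; the set `A`, the value of `k` and the helper names `find_step` and `k_fold_difference` are illustrative assumptions, and the search is by brute force rather than by the counting argument.

```python
from itertools import product

def find_step(A, k):
    """Search for a collision of f(a, x_1..x_k) = (x_1 - 2a, ..., x_k - 2^k a).

    Two distinct tuples with the same image must differ in their first entry,
    so the returned difference d = a - b is non-zero and satisfies
    2^j d in A - A for j = 1..k.  Returns None if no collision exists.
    """
    seen = {}
    for tup in product(A, repeat=k + 1):
        a, xs = tup[0], tup[1:]
        image = tuple(x - 2 ** (j + 1) * a for j, x in enumerate(xs))
        if image in seen:
            return a - seen[image]
        seen[image] = a
    return None

def k_fold_difference(A, k):
    """The set kA - kA for a finite set of integers A."""
    kA = {0}
    for _ in range(k):
        kA = {s + a for s in kA for a in A}
    return {s - t for s in kA for t in kA}

A = [0, 1, 3, 4, 7, 8, 9, 12]   # illustrative set of integers
k = 2
d = find_step(A, k)
difference_set = {a - b for a in A for b in A}
progression = [c * d for c in range(-2 ** k, 2 ** k + 1)]
```

Here `progression` is the symmetric progression $[-2^k, 2^k] \cdot d$ of length $2^{k+1} + 1$, and it lies inside `k_fold_difference(A, k)`.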

Remark 5.3. One may wish to generate different types of structure depending on the group; for example, for problems in $\mathbb{F}_3^n$ it is more natural to generate subspaces instead of arithmetic progressions. Establishing such a result in $\mathbb{F}_3^n$ is relatively straightforward: it is easy to see that adding a symmetric subset of $\mathbb{F}_3^n$ to itself generates a subspace of dimension equal to the number of summands provided the set has enough linearly independent vectors.



The proof of Lemma 5.1 should be compared with an argument of the first-named
author, Ruzsa and Schoen [12] that finds arithmetic progressions in single sumsets 
A + B, even when A and B are very sparse (much sparser than the sets considered in 
this paper). 



Next we record a combination of Lemma 5.1 and Proposition 1.1 that will be useful to us in our proof of Roth's theorem. Recall that $[N] := \{1, \ldots, N\}$.

Corollary 5.4. Let $\delta \in (0,1)$ be a parameter and suppose that $A \subseteq [N]$ has size $\alpha N$, where $\alpha \ge 4N^{-\delta^2/36}$. Then there is a symmetric arithmetic progression $P \subseteq [-N/2, N/2]$ of length

, /52lQg^^l/3^ 



\P\ ^ exp 



14 



log4/ay 



such that $0 \in P$ and, for each $t \in P$,

$$\|1_A * 1_A(x + t) - 1_A * 1_A(x)\|_2^2 \le \delta^2|A|^3.$$



Proof. Set 

/ 6HogN \ 
V361og4/ay 
and apply Proposition 1.1 to $A$ with $B = A$ and $S = [N]$. Note that we may certainly take $K = 2/\alpha$ since $A + [N] \subseteq [2N]$. Thus we get a set $T \subseteq [N]$ of size at least $(\alpha/4)^{9/\varepsilon^2}N$ that has

$$\|1_A * 1_A(x + t) - 1_A * 1_A(x)\|_2^2 \le \varepsilon^2|A|^3$$

for each $t \in T - T$. Now apply Lemma 5.1 to the set $T$ to get a symmetric arithmetic progression $P \subseteq kT - kT$ of length at least $2^{k+1} + 1$. By the triangle inequality we then have that any $t \in P$ gives

$$\|1_A * 1_A(x + t) - 1_A * 1_A(x)\|_2^2 \le \delta^2|A|^3;$$

this progression would thus satisfy the conclusion of the corollary were it not for the fact that it may not be contained in $[-N/2, N/2]$. It is however contained in $[-kN, kN]$, and so we may simply select a symmetric subprogression $P' \subseteq [-N/2, N/2]$ of $P$ of length at least $2\lfloor 2^{k-1}/k \rfloor + 1$; this progression will then do. Note that the condition on $\alpha$ comes from the requirement that $k$ be at least 1. □



6. Arithmetic progressions in sumsets 



In this section we shall prove Theorem 1.9. Our task is thus to exhibit, for two sets $A$ and $B$ in $[N]$, the existence of a long arithmetic progression in the sumset $A + B$. We do this by combining Theorem 4.2 (a consequence of Proposition 1.3) with Lemma 5.1.



Proof of Theorem \1.9\ . Set 



and 



k :-- 



10 



n 



a log 
log4//3 



1/4 



Assume $k \ge 1$, for otherwise the conclusion of the theorem is trivial. Apply Theorem 4.2 to $A$ and $B$ with these parameters and $S = [N]$. Since

$$A + B \subseteq B + [N] \subseteq [2N]$$

we may certainly take $K_2 = 2/\alpha$ and $K_3 = 2/\beta$, and we may take $K_1 = K_2$ by (2.4). This gives us a set $T \subseteq [N]$ of size $\delta N$, where

$$\delta \ge \exp\bigl(-300k^2(\log 4/\beta)(\log 2n)/\alpha\bigr),$$

such that $A + B$ contains a translate of any subset $P$ of $kT - kT$ of size at most $n$. By Lemma 5.1 we can find an arithmetic progression $P$ of length $n$ in $kT - kT$ provided

$$\delta > 2^{3k/2}/N^{1/(k+1)},$$

a condition that may be seen to hold by a short calculation. □

Remark 6.1. In contrast to previous proofs of results of the form of Theorem 1.9, there was no need for us to embed the sets $A$ and $B$ in a finite group $\mathbb{Z}_p$ for some prime $p$ larger than $N$ in order for us to carry out our analysis. Had we performed this embedding into $\mathbb{Z}_p$, however, we would have been able to use a slight simplification of Lemma 5.1 since we would only need to use it for $K = 1$.

Remark 6.2. By a minor modification of the proof of Proposition 3.3 (using the triangle inequality to get rid of the terms $1_A * 1_B(x)$ instead of $\mu_C * 1_B(x)$) one can deduce that $P \subseteq A + C$ for a very small set $C \subseteq B$. One thus needs to translate $A$ by very few elements of $B$ in order to generate long arithmetic progressions.



We similarly get the following local version of Theorem 1.9.

Theorem 6.3 (Arithmetic progressions in small sumsets). Suppose $A$ and $B$ are finite, non-empty subsets of an abelian group such that

$$|A + B| \le K_1|A| \quad\text{and}\quad |A + B| \le K_2|B|.$$

Then A + B contains an arithmetic progression of length at least 

\og\A\ 



I exp 



i^i log 27^2 



where $c > 0$ is an absolute constant.



Proof. This proof is virtually the same as that above. Set 

log|A| 



k :-- 



j_ 

10 



and $n := 2^{k+1}$. As before we apply Theorem 4.2 to $A$ and $B$ with these parameters, but this time with $S = A$. Thus we get a set $T \subseteq A$ of size

$$|T| \ge \exp\bigl(-150k^2K_1(\log 2K_2)(\log 2n)\bigr)\,|A|$$

such that $A + B$ contains a translate of any subset of $kT - kT$ of size at most $n$. By the Plünnecke-Ruzsa inequality, Theorem 2.2, we have that $|A + A| \le K_1K_2|A|$. Another routine calculation now shows that we can apply Lemma 5.1 to $T \subseteq A$ to find an arithmetic progression of length $n$ in $kT - kT$, which yields the result. (Note again that the theorem is trivial if $k < 1$.) □

Remark 6.4. Recall that arithmetic progressions may be degenerate in some groups; 
consider for example the group 

Remark 6.5. Other local versions of this result are possible: we could for example work relative to a set $S$ of small doubling such that $|B + S| \le K|B|$; this would yield slightly better bounds.



We cannot mention this topic without drawing the reader's attention to a remarkable 
construction [33] of Ruzsa that places a limit on the potential strength of results of the 
above form: 

Theorem 6.6. Let $\varepsilon > 0$. For every prime $p > p_0(\varepsilon)$ there is a symmetric set $A \subseteq \mathbb{Z}_p$ of size at least $(1/2 - \varepsilon)p$ such that $A + A$ contains no arithmetic progression of length

$$\exp\bigl((\log p)^{2/3 + \varepsilon}\bigr).$$

Let us also mention that if one only wishes to find arithmetic progressions of length about $\log N$ in $A + B$ then better results are available: one can work with much sparser sets than those considered in this paper by using the results in [12].

7. Roth's theorem



In this section we give our proof of Theorem 1.10. We shall employ a density-increment strategy, showing that if $A \subseteq \{1, \ldots, N\}$ is large and contains no three-term progressions then we can find a long arithmetic progression on which $A$ has significantly increased density. We can then iterate this argument in order to obtain a contradiction.

Let us introduce some notation before we begin. We denote the sum of a function $f : \mathbb{Z} \to \mathbb{R}$ with finite support over the three-term progressions in $\mathbb{Z}$ by $T_3(f)$; thus

$$T_3(f) := \sum_{x,y \in \mathbb{Z}} f(x)f(y)f(2y - x) = \sum_y f(y)(f * f)(2y).$$

Note that we may drop parts of subscripts when the meaning is clear. If $f = 1_A$ is the indicator function of a set then $T_3(f)$ is simply the number of three-term progressions in $A$. Note that this includes trivial (constant) three-term progressions and that it counts $(x, x+d, x+2d)$ separately from $(x+2d, x+d, x)$. We shall use the notation $\mu_X := 1_X/|X|$ to denote the normalized indicator function of a finite set $X$. For a subset $A$ of $X$ we shall say that $A$ has density $\alpha$ relative to $X$ if $|A| = \alpha|X|$; when $X$ is clear from the context we shall refer to $\alpha$ simply as the density of $A$. Finally, we write $\mathbb{E}_{x \in X}f(x) = \frac{1}{|X|}\sum_{x \in X}f(x)$ for the average of $f$ over $X$.
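To make the counting conventions for $T_3$ concrete, here is a short sketch that evaluates $\sum_{x,y} f(x)f(y)f(2y - x)$ directly for a finitely supported function stored as a dictionary; the function names and the example sets are illustrative assumptions.

```python
def T3(f):
    """Sum of f(x) f(y) f(2y - x) over all x, y, for a finitely supported f (a dict)."""
    return sum(fx * fy * f.get(2 * y - x, 0)
               for x, fx in f.items()
               for y, fy in f.items())

def indicator(A):
    return {a: 1 for a in A}

# {1, 2, 3}: three trivial progressions plus (1, 2, 3) and (3, 2, 1).
count_interval = T3(indicator({1, 2, 3}))

# {1, 2, 4, 5} has no non-trivial three-term progression, so only trivial ones are counted.
count_free = T3(indicator({1, 2, 4, 5}))
```

For a set with no non-trivial three-term progressions the count is exactly the size of the set, which is the form in which the quantity is used in the proof of Roth's theorem below.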

The core of our proof of Roth's theorem lies in the following proposition. 

Proposition 7.1. Let $\varepsilon > 0$ and suppose that $A \subseteq [N]$ has size $\alpha N$. Then there is a symmetric arithmetic progression $P \subseteq [-N/8, N/8]$ of length at least

e2logiV^^/=^' 



\P\ ^ cexp c , , 

where $c > 0$ is an absolute constant, such that

$$|T_3(1_A * \mu_P) - T_3(1_A)| \le \varepsilon|A|^2.$$

Proof. Let $Q$ be the arithmetic progression obtained from Corollary 5.4 applied to $A$ with parameter $\varepsilon$; thus $Q$ is large, $Q = -Q$ and $Q \subseteq [-N/2, N/2]$. Let $P$ be a symmetric subprogression of $Q$ of length at least $|Q|/8$ such that $4P \subseteq Q$; thus $P \subseteq [-N/8, N/8]$. We claim that this $P$ satisfies the conclusion of the proposition. Indeed,

$$T_3(1_A * \mu_P) = \mathbb{E}_{(y,z,w) \in P^3}\sum_x 1_A(x)\,1_A * 1_A(2x + 2y - z - w)$$

and so

$$|T_3(1_A * \mu_P) - T_3(1_A)| = \Bigl|\mathbb{E}_{y,z,w \in P}\sum_x 1_A(x)\bigl(1_A * 1_A(2x + 2y - z - w) - 1_A * 1_A(2x)\bigr)\Bigr|$$
$$\le |A|^{1/2}\,\mathbb{E}_{y,z,w \in P}\|1_A * 1_A(x + 2y - z - w) - 1_A * 1_A(x)\|_2$$
$$\le \varepsilon|A|^2,$$

these inequalities being instances of the triangle and Cauchy-Schwarz inequalities and the fact that $P + P + 2 \cdot P \subseteq Q$. □



We also require a preliminary lemma about $T_3$. The following lemma gives a lower bound for the minimal number of three-term progressions that a set (or a function) can contain given upper bounds on the function $r_3$; it is a quantitative version of an averaging argument of Varnavides [IS].

Lemma 7.2 (Varnavides' theorem). Let $N$ be a positive integer and suppose that $f : [N] \to [0,1]$ is a function with average $\mathbb{E}_{x \in [N]}f(x) = \alpha$. Then, for any positive integer
Tsif) >{a- M-'N\ 

The proof of this lemma proceeds via a double-counting argument and can be found in [13] for the case when $f$ is the indicator function of a set. In order to pass from a result about sets, like the lemma stated in [1^ , to a result about a function $f$ one can employ a standard probabilistic trick of defining a random set $A$ in $[N]$ by letting $x \in A$ with probability $f(x)$ independently for each $x$. See [IH Exercise 10.1.7] for more details.

We are now ready to proceed with the main body of the proof. We shall prove Theorem 1.10 in the following equivalent form.

Theorem 7.3. For any $c > 0$ there are positive numbers $C$ and $N_0$ such that

$$r_3(N) \le CN/(\log\log N)^c$$

for all $N \ge N_0$.

Proof. We begin by establishing the theorem for some $c > 0$; we shall then be able to bootstrap this to establish the full result. Various inequalities in the argument will hold by the assumption that $N$ is large enough; we shall not state this assumption explicitly each time it is used.

Let $A$ be a subset of $\{1, \ldots, N\}$ of size $\alpha N = r_3(N)$ that does not contain any non-trivial three-term progressions, and let $\varepsilon > 0$ be a parameter that is to be fixed later. Applying Proposition 7.1 to $A$ we obtain a long arithmetic progression $P$ such that

$$|T_3(1_A * \mu_P) - T_3(1_A)| \le \varepsilon|A|^2. \tag{7.1}$$

Our argument will be centred around the function

$$1_A * \mu_P(x) = |A \cap (x - P)|/|P|;$$

we shall show that if $0 < \delta < 1$ is chosen appropriately then there must be an $x$ for which

$$|A \cap (x - P)| > \delta^{-1}\alpha|P|.$$

This will form the base of our density increment argument.

Suppose, then, that $1_A * \mu_P(x) \le \delta^{-1}\alpha$ for all $x \in \mathbb{Z}$. Let $f(x) := (\delta/\alpha)\,1_A * \mu_P(x)$, so that $0 \le f(x) \le 1$ for all $x$, $\sum_x f(x) = \delta N$ and

$$T_3(f) = (\delta/\alpha)^3\,T_3(1_A * \mu_P).$$

Note also that $f$ is supported on $A + P \subseteq [1 - N/8, 9N/8] \cap \mathbb{Z}$, an interval of size at most $5N/4$. Now, $A$ contains only trivial three-term progressions and so $T_3(1_A) = |A|$. Thus (7.1) implies that

$$T_3(f) \le 2\varepsilon\delta^3N^2/\alpha \tag{7.2}$$

provided $\varepsilon \ge 1/|A|$. On the other hand, Lemma 7.2 tells us that
provided M ^ N^/^^ /2. 

Let us initially pick $\delta = 9/10$. One may check by hand that $r_3(10) = 5$; by picking $M = 10$ we therefore see that $T_3(f) \ge c_0N^2$ for some positive absolute constant $c_0$. Comparing this to (7.2) we see that we obtain a contradiction provided we pick $\varepsilon = c_1\alpha$ for some small constant $c_1 > 0$. (This is permissible provided $\alpha \ge 1/\sqrt{c_1N}$, which we assume.) Hence we must have that

$$|A \cap (x - P)| \ge \tfrac{10}{9}\alpha|P|$$

for some integer x, where P is a rather long progression. Let us assume that a ^ 
(log AT) -1/^ Then 

|P| ^ exp ((logiV)^/^) ; 

we have thus shown that $A$ has density at least $\frac{10}{9} \cdot \frac{r_3(N)}{N}$ on an arithmetic progression of length $N_1 := |P|$. We may thus rescale to obtain a subset of $\{1, \ldots, N_1\}$ that is also free of non-trivial three-term arithmetic progressions, but that is now much denser than the original set $A$.
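The numerical input $r_3(10) = 5$ used above can be confirmed by exhaustive search. The sketch below is a brute-force check (the function names are our own), where $r_3(N)$ denotes the largest size of a subset of $\{1, \ldots, N\}$ containing no non-trivial three-term arithmetic progression.

```python
from itertools import combinations

def has_3ap(S):
    """True if S contains x, y, 2y - x with x != y (a non-trivial three-term progression)."""
    s = set(S)
    return any(2 * y - x in s for x in s for y in s if x != y)

def r3(N):
    """Largest size of a progression-free subset of {1, ..., N}, by exhaustive search."""
    for size in range(N, 0, -1):
        if any(not has_3ap(c) for c in combinations(range(1, N + 1), size)):
            return size
    return 0
```

The search is exponential in $N$ and is intended only for very small values such as $N = 10$.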

We may now iterate this argument, obtaining a sequence of integers Nj with 

iV, >exp((logiV,_i)i/«) 
and a sequence of densities 6j such that 

> (f (^) . 

the only requirements for proceeding to the next stage of the iteration being that Sj ^ 
(log iV^) -1/6 and Nj ^ C for some absolute constant C. Since no 5j can exceed 1, this 
iteration must stop at some stage K with K ^ ^"^li^^io/g^^^ ; at which point one of these 
requirements must fail. From this we may deduce that 

(log log A^) i°g8 

for some absolute constant C. 
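The bound on the number of stages comes from a simple count: with $\delta = 9/10$ each stage multiplies the density by at least $10/9$, and no density can exceed 1, so starting from density $\alpha$ there can be at most $\log(1/\alpha)/\log(10/9)$ stages. A minimal sketch of this count, with an illustrative starting density and a function name of our own choosing:

```python
from fractions import Fraction
import math

def max_iterations(alpha, factor=Fraction(10, 9)):
    """Largest j for which alpha * factor**j is still at most 1."""
    density, steps = Fraction(alpha), 0
    while density * factor <= 1:
        density *= factor
        steps += 1
    return steps

steps = max_iterations(Fraction(1, 100))
closed_form = math.floor(math.log(100) / math.log(10 / 9))
```

Exact fractions are used in the loop so that the comparison with 1 is not subject to rounding.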

This proves the theorem for a fixed exponent $c$ of $\log\log N$. We may now use this to run the argument again, except that we do not now need to rely on numerical data in order to apply Lemma 7.2 effectively. That is, we may now pick $\delta$ arbitrarily small and then
find a fixed value M for which |5- ^^^^^^ ^ 5/2. This means that, instead of obtaining 
a density increment of a factor of ^, we may obtain an increment of an arbitrarily large 
factor 5-1, still on a progression of length at least exp ((log A^)^/®) (though we now need 
A^ to be large enough in terms of 5). Following the above argument through again, this 
shows that 

^ 

(log log A^) i°g8 

for A^ ^ Nq{5) and some constant C depending on 5. □ 
8. Strong approximate groups

Finally we prove Proposition 1.11, the result about strong approximate groups; recall that we say that $A$ is a strong $K$-approximate group if $1_A * 1_A(x) \ge |A|/K$ for each $x \in A^2$. This proposition does not follow directly from the almost-periodicity results; instead it uses the ideas in the proofs of those results in a slightly different way.

Proof of Proposition 1.11. We shall show that if $C \subseteq A$ is chosen at random then $CA \approx A^2$ with good probability. Indeed, let us start by picking a random set $C \subseteq A$ of size $k$. By the hypothesis on $A$, any $x \in A^2$ that satisfies $|\mu_C * 1_A(x) - 1_A * 1_A(x)| < |A|/K$ lies in $CA$, whence

$$\mathbb{P}(x \notin CA) \le \mathbb{P}\bigl(|\mu_C * 1_A(x) - 1_A * 1_A(x)| \ge |A|/K\bigr) \le 2e^{-2k/K^2},$$

the latter inequality being a standard distributional inequality for hypergeometric distributions; see, for example, |TT] (and cf. Proposition A.3). Summing this over all $x \in A^2$ we obtain the estimate

$$\mathbb{E}\,|\{x \in A^2 : x \notin CA\}| \le 2e^{-2k/K^2}|A^2|.$$

Markov's inequality therefore yields

$$\mathbb{P}\bigl(|A^2 \,\triangle\, CA| \le \lambda|A^2|\bigr) \ge 1 - 2e^{-2k/K^2}/\lambda;$$

let us pick $\lambda := 4e^{-2k/K^2}$ to make this probability be at least 1/2.

Now note that $|A^2| \le K|A|$; this follows from the inequality $1_A * 1_A(x) \ge |A|1_{A^2}(x)/K$ holding for all $x$. As in the proof of Proposition 3.1, this means that there is a set $C$ and a set $T \subseteq A^{-1}$ of size at least $|A|/(2K)^k$ such that

$$|A^2 \,\triangle\, tCA| \le \lambda|A^2|$$

for any $t \in T$. For any two elements $t_1, t_2 \in T$ we therefore have

$$|t_2t_1^{-1}A^2 \,\triangle\, A^2| \le 2\lambda|A^2|$$

by the triangle inequality. Thus we may take $S := TT^{-1}$ after choosing $k := \lceil (K^2\log 8/\varepsilon)/2 \rceil$.

□

Remark 8.1. It is easy to see that a strong $K$-approximate group must have small doubling, $|A^2| \le K|A|$, but unlike with sets of small doubling it is not clear how abundant strong $K$-approximate groups of different sizes are, even in the group $\mathbb{Z}_p$ for a prime $p$. Konyagin [271 Problem 5] raised the basic question of whether it is the case that for any set $A \subseteq \mathbb{Z}_p$ of size at most $\sqrt{p}$ there exists some element $x \in A + A$ such that $1_A * 1_A(x) \le C|A|^{1-c}$, where $C, c > 0$ are absolute constants. Partial progress was made on this question by Łuczak and Schoen [28], who also noted that work of Green and Ruzsa [2T] implies that one can always find an $x \in A + A$ with $1_A * 1_A(x) \le \max\bigl(1, |A|/(\log_2 p)^{1/2 + o(1)}\bigr)$. The results of this paper can be used to derive a bound similar to this, if perhaps slightly stronger, but we do not pursue this here.

9. Further remarks 
We conclude with some remarks. 
9.1. Convolutions of functions. Although we have focused on convolutions of sets in this paper, it is relatively easy to deduce results for convolutions of functions. Indeed, let $f, g : G \to [0,1]$ be two functions with finite supports $S_f$ and $S_g$. Define random sets $A, B \subseteq G$ by stipulating that $x \in A$ with probability $f(x)$ and $x \in B$ with probability $g(x)$, all independently. One may then use a concentration inequality such as Chernoff's inequality |1H Theorem 1.8] to deduce that there is a choice of sets $A \subseteq S_f$ and $B \subseteq S_g$ such that $A$ has size very close to $\sum f$, $B$ has size very close to $\sum g$ and $|1_A * 1_B(x) - f * g(x)|$ is small for every $x \in S_f \cdot S_g$. An almost-periodicity result saying that

\\f*g{tx)-f*g{x)\\l^e'{j:f) {Y.9f 
for every $t \in TT^{-1}$ for a large set $T$ then follows from the corresponding result for sets, and similarly for $L^p$-almost-periodicity. One may then deal with arbitrary real-valued functions with finite support by rescaling. It is also likely that one can prove the almost-periodicity results directly for functions, though the statements will look slightly different; we do not pursue this here.

9.2. Comparisons with Fourier-analytic results. Our proofs of the almost-periodicity results in this paper have been combinatorial, which meant that there was no need for us to distinguish between abelian and non-abelian groups. When dealing with finite abelian groups, however, it is possible to derive results similar to Corollaries 1.2 and 1.4 using Fourier analysis. Indeed, in the abelian setting Corollary 1.2 is essentially a result of Bogolyubov [2] coupled with a result of Chang [10] on the large spectra of subsets of abelian groups; see Lemma 4.36 and (the proof of) Proposition 4.39 in [H] . An important difference between the two approaches is that Fourier analysis provides one with more information about the set $T$: one may take it to be a so-called Bohr set (an approximate annihilator of a set of characters in the Pontryagin dual of $G$), and it is well known that Bohr sets are arithmetically structured sets. For instance, Bohr sets contain long arithmetic progressions, which means that one does not need to appeal to structure-generation results like Lemma 5.1. If one uses this as the base for the arguments of §7 (set in $\mathbb{Z}_N$ rather than $[N]$) then one can obtain a bound for $r_3(N)$ similar to that of an old but recently published proof of Roth's theorem due to Szemerédi |1D]; indeed, our argument is in some ways quite similar to Szemerédi's. We present further details of this argument in the note [2].

It is much less clear that one can obtain an $L^p$-almost-periodicity result of a type similar to Corollary 1.4 for abelian groups using Fourier analysis. One may extract such a result from the paper [4] of Bourgain that exhibits the existence of long arithmetic progressions in $A + B$; indeed, the main thrust of the paper is to establish the estimates required to prove such an almost-periodicity result. Specifically one can obtain a result of the following type.

Proposition 9.1. Let G he a finite abelian group and let e > and m G N &e two 
parameters. Suppose that f,g : G — )■ [0,1] have averages Ex^cfi^) = « cmdE^^Ggi^) = 
p. Then there is a Bohr set B = B(r,p) of rank |r| <C m^log(l/e)/e^ and radius 
p = ce^/m such that 

\\f*g{x + t)-f* g{x)hn, <: e(«/3)i/2|^|i+i/2- 

for each t E B. 
By a Bohr set $B(\Gamma, \rho)$ here we mean a set of the form

$$\{x \in G : |\gamma(x) - 1| \le \rho \text{ for all } \gamma \in \Gamma\},$$

where $\Gamma \subseteq \widehat{G}$ is a collection of characters.
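For concreteness, the following sketch computes such a Bohr set in the cyclic group $\mathbb{Z}_N$, assuming the standard identification of characters with frequencies $r$ via $\gamma_r(x) = e^{2\pi i rx/N}$; the function name and the example parameters are illustrative.

```python
import cmath

def bohr_set(N, freqs, rho):
    """B(Gamma, rho) in Z_N, where Gamma consists of the characters x -> exp(2*pi*i*r*x/N), r in freqs."""
    def char(r, x):
        return cmath.exp(2j * cmath.pi * r * x / N)
    return {x for x in range(N) if all(abs(char(r, x) - 1) <= rho for r in freqs)}

# Rank one: a symmetric interval around 0, i.e. an arithmetic progression.
B1 = bohr_set(101, [1], 0.5)

# Rank two: still symmetric and containing 0, but no longer an interval in general.
B2 = bohr_set(101, [3, 10], 0.8)
```

With a single frequency the Bohr set is a symmetric arithmetic progression through 0, which is the kind of structure referred to in the comparison above.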

Bourgain's argument is very elegant though also somewhat complex, relying on some quite sophisticated manipulations of sets of Fourier coefficients. We shall not say more about this here, save for making two comments. First, the set $B$ produced by the above proposition will in general be somewhat smaller than the set $T$ given by Corollary 1.4, but is also guaranteed to contain more structure, which is ultimately what yields Bourgain's superior exponent of 1/3 in place of our 1/4 in the length of the arithmetic progressions one finds in $A + B$. Second, if one wishes to compare the norm to $\alpha\beta$, say, then Corollary 1.4 is useful even if one of the sets $A$ and $B$ is rather sparse whereas Proposition 9.1 requires both sets to be quite large. More details about Proposition 9.1 may be found in the note [39] .

Obtaining the local versions of our results using Fourier analysis seems harder. We note that there are tools that get around this to some extent; notably there is the 'modelling' lemma of Green and Ruzsa [20] that allows one to 'isomorphically embed' a set $A \subseteq G$ with small doubling $|A + A| \le K|A|$ as a dense set $A' \subseteq G'$, where $|A'| \ge f(K)|G'|$. See for example the paper [38] of Sanders for an efficient proof of a local version of Roth's theorem that makes use of this lemma. Interestingly, modelling results of the same kind cannot exist for non-abelian groups [H].



9.3. Roth's theorem in other settings. In this paper we proved Roth's theorem in the setting of the integers $\{1, \ldots, N\}$. The Fourier-analytic proofs of Roth's theorem generally become simpler when studied in the vector space $\mathbb{F}_3^n$ over the finite field $\mathbb{F}_3$ (or $\mathbb{F}_p^n$ for a fixed prime $p$), and this holds true for our argument as well. There are two main reasons for this. One is that it is very easy to establish a result similar to Lemma 5.1 in $\mathbb{F}_3^n$, as remarked in §5. The other is that it becomes easier to run through the density increment strategy itself, since one can induct on subspaces rather than on arithmetic progressions. In particular one does not really need a result corresponding to Lemma 7.2 (Varnavides' theorem). The bounds one obtains for $r_3(\mathbb{F}_3^n)$ are not significantly better than the corresponding ones for $r_3(N)$ with $N \approx 3^n$, however.

We should mention in this context that Seva Lev has recently produced a proof [26] of 
the Fp-version of Roth's theorem that removes the use of characters from the general 
framework of Meshulam's proof [2^. Lev's proof involves very different ideas to those 
of this paper, however. 



9.4. Extensions. There are many potential extensions of the methods presented in this paper. It seems likely that the ideas used could also be used to tackle locally compact groups, this being a natural setting for many of the results considered here (where we have only dealt with discrete groups). An area of application that we have not discussed in detail in the current paper is that of Freiman-type results; let us for now remark that it is easy to obtain a number of rudimentary Freiman-type theorems by coupling our almost-periodicity results with so-called covering lemmas. This will be followed up elsewhere.

Appendix A. The moments of the binomial and hypergeometric distributions



As noted in the proof of Proposition 3.3, if one selects a random $k$-element subset $C$ from a set $A$ in an ambient group $G$ then, for any fixed element $x \in G$, the random variable $1_C * 1_B(x)$ follows a hypergeometric distribution. In this appendix we prove the bounds of Lemma 3.2 on the moments of such a distribution.

Recall that $X$ follows a hypergeometric distribution with parameters $N$, $M$ and $k$ if

$$\mathbb{P}(X = j) = \binom{M}{j}\binom{N - M}{k - j}\Big/\binom{N}{k},$$

so that X can be thought of as counting the number of marked objects selected when 
k objects are picked without replacement from a population of N objects, M of which 
are marked. If the k objects are selected with replacement then the number of marked 
objects selected follows a binomial distribution with parameters n = k and p = M/N, 
and the two distributions are closely related. We have found certain estimates for 
the binomial distribution to be more readily available in print than the corresponding 
estimates for the hypergeometric distribution; the following corollary of a result of 
Hoeffding [231, Theorem 4] allows us to make use of these results. 

Proposition A.1. Let $X$ follow a hypergeometric distribution as above and let $Y$ follow a binomial distribution with parameters $n = k$ and $p = M/N$. Then for any convex, continuous function $f$ we have

$$\mathbb{E}f(X) \le \mathbb{E}f(Y).$$

In particular, for $m \ge 1/2$ we have

$$\mathbb{E}|X - kM/N|^{2m} \le \mathbb{E}|Y - np|^{2m}.$$
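The comparison of central moments can be verified exactly for small parameters. The sketch below computes both distributions with rational arithmetic; the parameter values $N = 20$, $M = 7$, $k = 5$ and the helper names are illustrative assumptions.

```python
from fractions import Fraction
from math import comb

def hypergeom_pmf(N, M, k):
    """P(X = j) for k draws without replacement from N objects, M of them marked."""
    lo, hi = max(0, k - (N - M)), min(k, M)
    return {j: Fraction(comb(M, j) * comb(N - M, k - j), comb(N, k))
            for j in range(lo, hi + 1)}

def binom_pmf(n, p):
    """P(Y = j) for a binomial distribution with parameters n and p (p a Fraction)."""
    return {j: comb(n, j) * p ** j * (1 - p) ** (n - j) for j in range(n + 1)}

def central_moment(pmf, centre, power):
    return sum(prob * abs(j - centre) ** power for j, prob in pmf.items())

N, M, k = 20, 7, 5
p = Fraction(M, N)
X = hypergeom_pmf(N, M, k)
Y = binom_pmf(k, p)
```

Both distributions have mean $kM/N$, so the same centre is used for each.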

Lemma 3.2 therefore follows immediately from the following proposition.

Proposition A.2. Let $m \ge 1$ and suppose that $X$ follows a binomial distribution with parameters $n$ and $p$. Then

$$\mathbb{E}|X - np|^{2m} \le 2(3mnp + m^2)^m. \tag{A.1}$$
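As a sanity check of (A.1) for integer values of $m$, the following sketch computes the $2m$-th central moment of a binomial distribution exactly and compares it with $2(3mnp + m^2)^m$ over a small grid of parameters; the grid and the function names are illustrative choices, not part of the argument.

```python
from fractions import Fraction
from math import comb

def binomial_central_moment(n, p, power):
    """Exact E|X - np|^power for X binomial with parameters n and p (p a Fraction)."""
    return sum(comb(n, j) * p ** j * (1 - p) ** (n - j) * abs(j - n * p) ** power
               for j in range(n + 1))

def moment_bound_holds(n, p, m):
    """Check E|X - np|^(2m) <= 2 (3 m n p + m^2)^m."""
    return binomial_central_moment(n, p, 2 * m) <= 2 * (3 * m * n * p + m * m) ** m

grid = [(n, p, m)
        for n in (1, 5, 20)
        for p in (Fraction(1, 10), Fraction(1, 2), Fraction(9, 10))
        for m in (1, 2, 3)]
```

For $m = 1$ the left-hand side is just the variance $np(1-p)$.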

In order to prove this we shall make use of the following deviation estimates, the type of 
which is often associated with the names of Bennett, Bernstein, Chernoff and Hoeffding. 

Proposition A.3. Let $X$ follow a binomial distribution with parameters $n$ and $p$. Then 

$$\mathbb{P}(X \le np - t) \le \exp\left(-\frac{t^2}{2np}\right) \tag{A.2}$$

and 

$$\mathbb{P}(X \ge np + t) \le \exp\left(-\frac{t^2}{2(np + t/3)}\right) \tag{A.3}$$

for any $t \ge 0$. 
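The two deviation estimates are easy to compare with exact binomial tails. The following sketch does so for rational $p$, taking the bounds in their usual form $\exp(-t^2/2np)$ for the lower tail and $\exp(-t^2/2(np + t/3))$ for the upper tail; all function names are illustrative, and $p > 0$ is assumed so that the lower-tail bound is defined.

```python
from fractions import Fraction
from math import comb, exp


def binom_pmf(n, p, j):
    """P(X = j) for X ~ Binomial(n, p), exact when p is a Fraction."""
    return comb(n, j) * p**j * (1 - p)**(n - j)


def lower_tail(n, p, t):
    """P(X <= np - t), summed exactly and returned as a float."""
    return float(sum(binom_pmf(n, p, j)
                     for j in range(n + 1) if j <= n * p - t))


def upper_tail(n, p, t):
    """P(X >= np + t), summed exactly and returned as a float."""
    return float(sum(binom_pmf(n, p, j)
                     for j in range(n + 1) if j >= n * p + t))


def lower_tail_bound(n, p, t):
    """exp(-t^2 / (2np)); requires p > 0."""
    t = float(t)
    return exp(-t * t / (2 * float(n * p)))


def upper_tail_bound(n, p, t):
    """exp(-t^2 / (2(np + t/3)))."""
    t = float(t)
    return exp(-t * t / (2 * (float(n * p) + t / 3)))
```

With $n = 20$, $p = 1/4$ and $t = 3$, for example, one compares `upper_tail(20, Fraction(1, 4), 3)` with `upper_tail_bound(20, Fraction(1, 4), 3)`.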

Proofs of these estimates may be found in [?]; see also [3] and [T]. They can be derived 
from an application of Markov's inequality to the random variable $e^{\theta(X - np)}$ using the 
fact that the moment generating function $\mathbb{E}e^{\theta(X - np)}$ is $e^{-\theta np}(pe^{\theta} + 1 - p)^n$. 

Proof of Proposition A.2. We may write 

$$\mathbb{E}|X - np|^{2m} = \int_0^\infty \mathbb{P}(|X - np|^{2m} > t)\,dt. \tag{A.4}$$

Since $\mathbb{P}(|X - np| > t) = \mathbb{P}(X < np - t) + \mathbb{P}(X > np + t)$ we may decompose the 
right-hand side of (A.4) as a sum of two integrals $I^-$ and $I^+$ in an obvious way. The 
deviation estimates (A.2) and (A.3) then give 

$$I^- \le \int_0^\infty \exp\left(-\frac{t^{1/m}}{2np}\right) dt = (2np)^m\,\Gamma(m+1)$$

and 

$$I^+ \le \int_0^\infty \exp\left(-\frac{t^{1/m}}{2(np + t^{1/2m}/3)}\right) dt.$$

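We split the range of integration of this latter integral into two parts $I_1$ and $I_2$ defined 
as follows. Let $\lambda := \frac{1}{3} + \frac{1}{3}\sqrt{1 + 6np/m}$, so that $9(\lambda m)^2/2(np + \lambda m) = 3m$; $I_1$ is then 
the integral over the range $0 \le t \le (3\lambda m)^{2m}$ and $I_2$ the integral over the remaining 
range. Thus 

$$I_1 \le \int_0^\infty \exp\left(-\frac{t^{1/m}}{2(np + \lambda m)}\right) dt = (2np + 2\lambda m)^m\,\Gamma(m+1).$$

We need to take a little more care with $I_2$. Let us write $w := \frac{9(\lambda m)^2}{2(np + \lambda m)} = 3m$. Then 

$$I_2 \le \int_{(3\lambda m)^{2m}}^\infty \exp\left(-\frac{3t^{1/2m}}{2(1 + np/\lambda m)}\right) dt = 2m\left(\frac{2(np + \lambda m)}{3\lambda m}\right)^{2m} \int_w^\infty z^{2m-1} e^{-z}\,dz.$$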

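Making the change of variables $u = z - w$, this last integral becomes 

$$w^{2m-1} e^{-w} \int_0^\infty \left(1 + \frac{u}{w}\right)^{2m-1} e^{-u}\,du \le w^{2m-1} e^{-w} \int_0^\infty e^{-(1 - 2m/w)u}\,du,$$

the inequality holding since $1 + x \le e^x$ for all $x$, and this expression equals $w^{2m} e^{-w}/m$ 
because $w = 3m$. Thus 

$$I_2 \le 2(3\lambda m)^{2m} e^{-3m}.$$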

Combining these estimates for $I^-$ and $I^+ = I_1 + I_2$ we obtain 

$$\mathbb{E}|X - np|^{2m} \le (2np)^m\,\Gamma(m+1) + (2np + 2\lambda m)^m\,\Gamma(m+1) + 2(9\lambda^2 m^2/e^3)^m.$$

Using the easily-verifiable bound $\Gamma(m+1) \le 2(3m/5)^m$ and the definition of $\lambda$ then 
yields (A.1) after some routine but technical calculations. □ 
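Two of the ingredients used in the proof lend themselves to a quick numerical check: the identity $9(\lambda m)^2/2(np + \lambda m) = 3m$ that determines $\lambda$, and the bound $\Gamma(m+1) \le 2(3m/5)^m$. The sketch below takes $\lambda$ to be the positive root $\frac{1}{3} + \frac{1}{3}\sqrt{1 + 6np/m}$ of that identity, which is our reading of the definition; the function names are illustrative only.

```python
from math import gamma, isclose, sqrt


def lam(mean, m):
    """Positive root of 9(lam*m)^2 / (2(mean + lam*m)) = 3m, where
    mean stands for np; assumed to match the lambda of the proof."""
    return (1 + sqrt(1 + 6 * mean / m)) / 3


def w(mean, m):
    """The quantity 9(lam*m)^2 / (2(mean + lam*m)), which should equal 3m."""
    lm = lam(mean, m) * m
    return 9 * lm**2 / (2 * (mean + lm))


def lambda_identity_holds(mean, m):
    """Check w = 3m up to floating-point rounding."""
    return isclose(w(mean, m), 3 * m)


def gamma_bound_holds(m):
    """Check Gamma(m + 1) <= 2(3m/5)^m for a single real m >= 1."""
    return gamma(m + 1) <= 2 * (3 * m / 5)**m
```

Both checks can be run over a grid, for instance `gamma_bound_holds(q / 4)` for `q` ranging over `range(4, 161)`, to see the bound hold at quarter-integer values of $m$ between 1 and 40.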
Remark A.4. By being a bit more careful in the above proof one could obtain somewhat 
smaller values for the constants appearing in the proposition, though this is not 
particularly important for our applications. We should also remark that, although we 
only required it for binomial random variables, Proposition A.2 holds even when $X$ is 
a sum of independent Bernoulli random variables that are not necessarily identically 
distributed. In that setting $n$ is the number of summands and $p$ is $\mathbb{E}X/n$, and one may 
prove the result exactly as above since Proposition A.3 holds for such random variables. 
(Let us also note that Proposition A.2 holds with different constants for sums of more 
general random variables.) 